













JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 





VOLUME 48    DECEMBER 1953    NUMBER 264





ARTICLES 


Statistical Problems of the Kinsey Report
    WILLIAM G. COCHRAN, FREDERICK MOSTELLER, and JOHN W. TUKEY
The Inventory Problem
    J. LADERMAN, S. B. LITTAUER, and LIONEL WEISS
Census Tracts and Urban Research
    DONALD L. FOLEY
On a Probability Mechanism to Attain an Economic Balance Between the
Resultant Error of Response and the Bias of Nonresponse
    W. EDWARDS DEMING
Effect of Weighting by Card-duplication on Efficiency of Survey Results
    IRVING ROSHWALB
The Mathematical Basis for the Bean Method of Graphic Multiple Correlation
    RICHARD J. FOOTE
Recent Advances in Finding Best Operating Conditions
    R. L. ANDERSON
A Note on Regression when there Is Extraneous Information about One of
the Coefficients
    J. DURBIN
A Hollerith Technique for the Solution of Normal Equations
    M. J. R. HEALY and G. V. DYKE
The Use of Runs to Control the Mean in Quality Control
    H. WEILER
Truncated Poisson Distributions
    PAUL R. RIDER
Percentage Points of the Incomplete Beta Function
    ROBERT E. CLARK
Bibliography of Nonparametric Statistics and Related Topics
    I. RICHARD SAVAGE
Errata
Random Digits
Index to Volume 48, 1953 (Nos. 261-264)


BOOK REVIEWS 


Garrett, H. E., Statistics in Psychology and Education, Fourth Edition
    FREDERIC M. LORD
Tolles, N. Arnold, and Raimon, Robert L., Sources of Wage Information:
Employer Associations
    M. I. GERSHENSON
Revue de Statistique Appliquée. Volume 1, No. 1, 1953
Nyblén, Göran, The Problem of Summation in Economic Science. A Methodological
Study with Applications to Interest, Money and Cycles
    JOHN S. CHIPMAN
Organisation for European Economic Co-operation, Measurement of Productivity
    PETER O. STEINER
Siegel, Irving H., Concepts and Measurement of Production and Productivity
    ARTHUR L. BROIDA
Publications Received





JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 


The Editors welcome the submission of manuscripts for possible publication. 
They should be typewritten entirely double-spaced, including footnotes, and 
two copies should be sent to the Editor, W. Allen Wallis, 207 Haskell Hall, 
University of Chicago, Chicago 37. Books for review should be sent to the same 
address. Unsolicited book reviews are not accepted, but suggestions of titles for 
review are welcome. 


EDITOR 


W. ALLEN WALLIS, University of Chicago
ASSISTANT TO THE EDITOR: MARGARET A. LABADIE


ASSOCIATE EDITORS 


HAROLD A. FREEMAN                     PHILIP J. MCCARTHY
Massachusetts Institute of Tech.      Cornell University
GEORGE M. KUZNETS                     I. RICHARD SAVAGE
University of California              National Bureau of Standards
C. ASHLEY WRIGHT
Standard Oil Company (N.J.)


ADVISORY PANEL OF FORMER EDITORS 


WILLIAM G. COCHRAN (1945-50)          FRANK A. ROSS (1926-34, 41-45)
Johns Hopkins University              Thetford, Vermont

WILLIAM F. OGBURN (1920-1925)         FREDERICK F. STEPHAN (1935-40)
University of Chicago                 Princeton University


Errata: Readers and authors are urged to submit to the Editor notices of 
errors found in this or any previous issue. These will be published once 
a year, in the December issue. 







STATISTICAL PROBLEMS OF THE KINSEY REPORT* 


WILLIAM G. COCHRAN, Johns Hopkins University
FREDERICK MOSTELLER, Harvard University
JOHN W. TUKEY, Princeton University


This is the report of a committee appointed by the Commission on
Statistical Standards of the American Statistical Association to
review the statistical methods used in Sexual Behavior in the Human
Male. We shall refer both to the book and to its authors (Kinsey,
Pomeroy and Martin) as KPM. The committee wishes to emphasize
that this report is confined to statistical methodology, and does not
concern itself with the appropriateness or the limitations of orgasm
as a measure of sexual behavior. The treatment of specific problems
has necessitated an examination of some of the statistical and method- 
ological problems of such studies, and the organization of frames of 
reference in which the statistical methods can be discussed. The com- 
mittee hopes that both detailed and general considerations will be of 
service to Dr. Alfred C. Kinsey and his co-workers; to the National 
Research Council’s Committee for Research on Problems of Sex, who 
requested the appointment of this committee; and to others facing 
similar statistical or methodological problems. 

* This article consists of the main text, but not the appendices, of the report of a committee ap-
pointed in 1950 by S. S. Wilks as President of the American Statistical Association, to review the sta-
tistical methods used by Alfred C. Kinsey, Wardell B. Pomeroy, and Clyde E. Martin in their Sexual
Behavior in the Human Male (Philadelphia, W. B. Saunders Co., 1948). For further details on the ap-
pointment of the committee and its charge, see Section 1, p. 676 below. For an outline of the appendices,
as well as of this paper, see Section 3, pp. 678-81. Appendix G, “Principles of Sampling,” will appear
as an article in the March issue of this JOURNAL. The full report, including both the text given here
and the appendices, will be published as a monograph by the American Statistical Association in 1954.

We have endeavored to write this report in a way that would mini-
mize the possibility of misunderstanding. To do this, it is necessary to
deal with many detailed aspects of the work, one at a time. By judicious
selection of topics and attitudes, it would have been possible to write 
two factually correct reports, one of which would leave the impression 
with the reader that KPM’s work was of the highest quality, the other 
that the work was of poor quality and that the major issues were 
evaded. We have not written either of these extreme reports. 

Even within the present report, a reader who is trying only to sup- 
port his own opinions could select sections and topics to buttress either 
view. In the details of this report the reader will find numerous prob- 
lems that we feel KPM handled admirably. If he pays attention only 
to these, he would find support for the opinion that the work is nearly 
impeccable and that the conclusions must be substantially correct.
There are other problems which we believe KPM failed to handle ade- 
quately, in some cases because they did not devote the necessary skill 
and resources to the problems, in other cases because no solutions for 
the problems exist at present. The reader who concentrates only on the 
parts of our report in which such problems are discussed would find 
support for the opinion that KPM’s work is of poor quality. 

Our own opinion is that KPM are engaged in a complex program of 
research involving many problems of measurement and sampling, for 
some of which there appear at the present to be no satisfactory solu- 
tions. While much remains to be done, our overall impression of their 
work to date is favorable. 

Many details are discussed in the body and appendices of this report. 
The main conclusions are as follows: 

1. The statistical and methodological aspects of KPM’s work are 
outstanding in comparison with other leading sex studies. In a com- 
parison with nine other leading sex studies (four supported in part 
by the same NRC Committee) KPM were superior to all others in 
the systematic coverage of their material, in the number of items which 
they covered, in the composition of their sample as regards its age, 
educational, religious, rural-urban, occupational, and geographic repre- 
sentation, in the number and variety of methodological checks which 
they employed, and in their statistical analyses. So far as we can judge 
from our present knowledge, or from the critical evaluations of a num- 
ber of other qualified specialists, their interviewing was of the best. 

2. KPM’s interpretations were based in part on tabulated and statis- 
tically analyzed data, and in part on data and experience which were 
not presented because of their nature or because of the limitations of 
space. Some interpretations appear not to have been based on either 
of these. We feel that unsubstantiated assertions are not in themselves
inappropriate in a scientific study. The accumulated insight of an
experienced worker frequently merits recording when no documenta- 
tion can be given. However, KPM should have indicated which of their 
statements were undocumented or undocumentable and should have 
been more cautious in boldly drawing highly precise conclusions from 
their limited sample. 

3. Many of KPM’s findings are subject to question because of a 
possible bias in the constitution of the sample. This is not a criticism 
of their work (although it is a criticism of some of their interpretations). 
No previous sex study of a broad human population known to us, medi- 
cal, psychiatric, psychological, or sociological, has been able to avoid 
this difficulty, and we believe that KPM could not have avoided the 
use of a nonprobability sample at the start of their work. Something 
may now perhaps be done to study and reduce this possible bias, by a 
probability sampling program. 

In our opinion, no sex study of a broad human population can expect 
to present incidence data for reported behavior that are known to be 
correct to within a few percentage points. Even with the best available 
sampling techniques, there will be a certain percentage of the popula- 
tion who refuse to give histories. If the percentage of refusals is 10 
per cent or more, then however large the sample, there are no statistical 
principles which guarantee that the results are correct to within 2 or 
3 per cent. The results may actually be correct to within 2 or 3 per cent, 
but any claim that this is true must be based on the undocumented 
opinion that the behavior of those who refuse to be interviewed is not 
very different from that of those who are interviewed. These comments, 
which are not a criticism of KPM’s research, emphasize the difficulty 
of answering the question: “How accurate are the results?”, which is 
naturally of great interest to any user of the results of a sex study. 
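
The arithmetic behind the statement about refusals can be made concrete. What follows is a minimal sketch and not part of KPM's or the committee's analysis: the function name `incidence_bounds` and the figures passed to it are hypothetical, chosen only for illustration. It assumes that nothing at all is known about those who refuse, so that the incidence among them might be anywhere from 0 to 100 per cent.

```python
def incidence_bounds(observed_incidence, refusal_rate):
    """Return (low, high) bounds on the incidence in the whole selected sample.

    observed_incidence: proportion reporting the behavior among respondents.
    refusal_rate: proportion of the selected sample that refused an interview.
    Assumption (illustrative): nothing is known about the refusers, so their
    incidence may be as low as 0 or as high as 1.
    """
    if not (0.0 <= observed_incidence <= 1.0 and 0.0 <= refusal_rate <= 1.0):
        raise ValueError("both arguments must lie between 0 and 1")
    responding = 1.0 - refusal_rate
    # Lower bound: no refuser shows the behavior.
    low = responding * observed_incidence
    # Upper bound: every refuser shows the behavior.
    high = responding * observed_incidence + refusal_rate
    return low, high


# Hypothetical figures: 40 per cent incidence among respondents, 10 per cent refusals.
low, high = incidence_bounds(0.40, 0.10)
print(f"incidence lies between {low:.2f} and {high:.2f}")
```

Under these assumptions the interval is exactly as wide as the refusal rate, here 10 percentage points, and it does not shrink as the sample grows. Any narrower claim rests on an opinion about how the refusers behave.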

4. Many of KPM’s findings are subject to question because of possi- 
ble inaccuracies of memory and report, as are all studies of intimate 
human behavior among broad segments of the population. No one has 
proposed any way to remove the dangers of recall (involving both 
memory and report) and KPM were superior to the nine studies re- 
ferred to above in their attempts to control and measure these dangers. 
We have suggested still further expansions of their methodological 
checks. 

Until new methods are found, we believe that no sex study of inci- 
dence or frequency in large human populations can hope to measure 
anything but reported behavior. It may be possible to obtain observed 
or recorded behavior for certain special groups, but no suggestions have
been made by KPM, the critics, or this committee which would make
it feasible to study observed or recorded behavior for a large human 
population. These remarks are intended as a comment on the present 
status of research techniques in sex studies and not as a criticism of 
KPM’s work. 

5. KPM received only limited statistical help, in part because the 
work was pursued during the War years when such expert help was 
difficult to find for non-military projects. In view of the limited statis- 
tical knowledge which was available to them, as made clear by the 
failure of their sample size experiment, KPM deserve much credit 
for the straight thinking which brought them safely by many pitfalls.
Their need of adequate statistical assistance continues to be serious.
Substantial assistance might come through the development of a
statistical clinic at Indiana University, or through the addition of a
statistical expert to KPM’s own staff. Unfortunately the sort of assist-
ance which might resolve some of their most complex problems would
require understanding, background, and techniques that perhaps not
more than twenty statisticians in the world possess.

6. A probability sampling program should be seriously considered
by KPM. The actual gains from an extensive program are limited, to
an extent unknown at present, by refusal rates and indirectly by costs,
particularly by the costs of maintaining the present quality of the indi-
vidual histories by KPM’s approach. A step-by-step program, starting
with a very small pilot study, is recommended.

7. In addition to proposing a probability sampling program, we 
have made numerous suggestions in this report for the modification 
and strengthening of KPM’s present approach. The suggestions in- 
clude expanded methodological checks of their sampling program, a 
further study of their refusal rate, some modification of their methods 
of analyses, further comparisons of reported vs. observed behavior, 
and stricter interpretations of their data. We have been informed by 
KPM that many of these improvements, including some expansion 
of their techniques for obtaining data, have already been incorporated 
in the volume dealing with sexual behavior in the human female. 


CHAPTER I. BACKGROUND AND ORGANIZATION 
1. Organization involved 


This committee, consisting of William G. Cochran, Chairman, 
Frederick Mosteller, and John W. Tukey, was appointed by President 
S. S. Wilks in September 1950 as a committee of the Commission on
Statistical Standards of the American Statistical Association. This
action was initiated by a request from the Committee for Research 
on Problems of Sex of the National Research Council, as indicated by 
the following excerpt from a letter dated May 5, 1950, from Dr. George 
W. Corner, a member of the NRC Committee, to Dr. Isador Lubin, 
Chairman of the Commission on Statistical Standards of the American 
Statistical Association. 

“In accordance with our telephone conversation of yesterday, I am writing 
to state to you the desire of the Committee for Research in Problems of Sex, 
of the National Research Council, that the Commission on Standards of the 
American Statistical Association will provide counsel regarding the research 
methods of the Institute for Sex Research of Indiana University, led by 
Dr. Alfred C. Kinsey. 

“This Committee has been the major source of financial support of Dr. 
Kinsey’s work, and at its annual meeting on April 27, 1950, again renewed 
the expression of its confidence in the importance and quality of the work 
by voting a very substantial grant for the next year. 

“Recognizing however that there has been some questioning, in recently 
published articles, of the validity of the statistical analysis of the results 
of this investigation, the Committee, as well as Dr. Kinsey’s group, is 
anxious to secure helpful evaluation and advice in order that the second 
volume of the report, now in preparation, may secure unquestioned ac- 
ceptance.” 


Some correspondence ensued, in which Wilks indicated the willing- 
ness of the American Statistical Association to provide counsel as 
requested. 

Kinsey, in a letter to Wilks dated August 28, stated that 

“we should make it clear that we deeply appreciate the willingness of the 
American Statistical Association to undertake such an examination of our 
statistical methods, that we will give it full cooperation in having access 
to all of our data as far as the peculiar confidential nature of our data will 
allow, and that we understand, of course, that the committee shall be free to 
publish its findings of whatever sort.” 


In the same letter, Kinsey also made a number of suggestions about 
the constitution and work of the committee, to the effect that the 
persons on the committee should be primarily statisticians with experi- 
ence in human population studies, that they should plan to review 
the statistical criticisms which have been published about the book on 
the male, and that they should compare methods used by Kinsey and 
his associates in their research with methods in other published research 
in similar fields. 

With respect to the research on the human female, Kinsey wrote as 
follows: 







“It should, however, be made clear that all the data that will go into our 
volume on Sexual Behavior in the Human Female are already gathered, that 
the punch cards have already been set up and most of them punched, and 
that statistical work is proceeding on that volume now. While the recom- 
mendations of the committee may modify further work, it can affect this 
forthcoming volume only in the form in which the material is presented, the 
limitations of the conclusions, and the careful description of the limitations 
of our method and conclusions.” 


2. Committee procedure 


Although no specific written directive was issued to the committee, 
the letter quoted earlier from Corner to Lubin sets forth the task as- 
signed to the committee. In one respect the scope was deliberately 
reduced as compared with that envisaged in the letter. The committee 
decided not to undertake any examination of the researches and data 
relating to the human female, in order to avoid disruption of Kinsey’s 
proposed schedule of work. 

In October, 1950, the committee spent five days at the Institute for 
Sex Research of Indiana University, accompanied by Mr. Robert 
Osborn as assistant. Subsequent meetings of the committee were held 
at Chicago (December 1950), Princeton (January 1951), Cambridge 
(May 1951), Baltimore (July 1951) and Princeton (October 1951). 

In their review of previous studies of sexual behavior, the committee 
received major assistance from Dr. W. O. Jenkins, who prepared a 
series of reports which appear in Appendix B. Mr. A. Kimball Romney 
prepared a helpful index of the principal criticisms made of the statisti- 
cal methodology used in the book Sexual Behavior in the Human 
Male. 


3. Structure of this report as a whole 


KPM’s program of research is a major undertaking, involving more 
than ten years’ work. Any discussion of it which aims at thoroughness 
must itself be lengthy. In order to keep the main body of our report 
down to a reasonable length, we have relegated much of the documen- 
tation of our conclusions, and all detailed discussion, to the following 
series of appendices. 


A. Discussion of comments by selected technical reviewers.
B. Comparison with other studies.
C. Proposed further work.
D. Probability sampling considerations.
E. The interview and the office as we saw them.
F. Desirable accuracies.
G. Principles of sampling.







Appendix A contains our discussion of the statistical and quantita- 
tive methodological content of six of the critical reviews which ap- 
peared after the publication of the KPM book. These six were chosen 
from among the large number of published reviews, because they 
concentrated their attention on the statistical aspects of the research. 
Appendix A also includes, where this seems appropriate, discussion of 
some critical points which were not explicitly raised in the reviews in 
question. 

Appendix B, by W. O. Jenkins, contains a review of the statistical 
aspects of eight of the major previous sex studies which have been car- 
ried out in the United States. Also included are similar reviews of the 
KPM book and of one more recent study by J. E. Farris. The purpose 
of this appendix is to provide a basis for comparing the KPM study 
with the other studies as to comprehensiveness, sampling methods, 
interviewing methods and statistical analysis. 

Appendix C begins by outlining and commenting on suggestions for 
further work made by the reviewers. It explains the difficulty of esti- 
mating the stability of results from a sampling procedure such as 
KPM’s, offers some possible methods for this estimation, and suggests 
how more appropriate variables for expressing sexual behavior might 
be developed, and how compound variables might be built on these. 
It then explores the problem of when to adjust, giving a simple numeri- 
cal procedure for making the decision, and concludes by summarizing 
the probability sampling suggestions derived from Appendix D. 

Appendix D discusses the problems of analysis and usefulness of 
probability sampling as a check on a nonprobability sample, particu- 
larly when refusal rates are considered; two possible types of probabil- 
ity samples and a probability sampling program which KPM might 
undertake; and the alternative of studying restricted populations. 

Appendix E discusses the interview and the office as we saw them. 
Appendix F discusses what seems to be known about the accuracy 
needed in such work as KPM’s. Appendix G presents an account of 
the principles of sampling illustrated with general examples. 

Many of the problems faced by KPM occur in most types of soci- 
ological investigation. Some are likely to be encountered in almost any 
kind of scientific investigation. For this reason, we have thought it 
advisable to present certain of the methodological issues in rather 
general terms. 

The reader is asked to bear in mind that in general our conclusions 
are not documented in the main body of the report, but in the appen- 
dices to which references are given. 







CONTENTS OF FULL REPORT

Chapter   Title                                              Sections   Page

          STATISTICAL PROBLEMS OF THE KINSEY REPORT

I         BACKGROUND AND ORGANIZATION                                   676
II        MAJOR AREAS OF CHOICE                                         684
III       PRINCIPLES OF SAMPLING                                        688
IV        THE INTERVIEW AREA                                            693
V         THE SAMPLING AREA                                             695
VI        METHODOLOGICAL CHECKS                                         697
VII       ANALYTICAL TECHNIQUES                                         699
VIII      TWO COMPLEX ANALYSES                                          703
IX        CARE IN INTERPRETATION                                        706
X         COMPARISON WITH OTHER STUDIES                                 708
XI        CONCLUSIONS                                                   711
XII       SUGGESTED EXTENSIONS                                          713

          APPENDIX A: DISCUSSION OF COMMENTS BY SELECTED
          TECHNICAL REVIEWERS

I-A       THE SAMPLING PROBLEM
II-A      THE INTERVIEWING PROBLEM                           7-11
III-A     THE REPORT PROBLEM                                 12-16
IV-A      THE STABILITY PROBLEM                              17-21
V-A       THE MEASUREMENT PROBLEM
VI-A      STATISTICAL TECHNIQUES
VII-A     PRESENTATION OF DATA
VIII-A    INTERPRETATION
IX-A      SUMMARY

          APPENDIX B: COMPARISON WITH OTHER STUDIES

          APPENDIX C: PROPOSED FURTHER WORK

I-C       SUGGESTIONS BY CRITICS
II-C      MEASURING STABILITY
III-C     DEVELOPMENT OF APPROPRIATE VARIABLES
IV-C      DEVELOPMENT OF COMPOSITE VARIABLES
V-C       DETERMINATION OF OPTIMUM FINENESS OF ADJUSTMENT
VI-C      PROPOSED PROBABILITY SAMPLING
VII-C     SUMMARY

          APPENDIX D: PROBABILITY SAMPLING CONSIDERATIONS

I-D       GAINS FROM PURE PROBABILITY SAMPLING
II-D      COMPARISON OF SAMPLES
III-D     REFUSAL RATES AND SAMPLE SIZES
IV-D      LOCATING THE SAMPLE
V-D       A SPECIAL PROBABILITY SAMPLE
VI-D      A HOUSEHOLD PROBABILITY SAMPLE
VII-D     COSTS AND A PILOT STUDY
VIII-D    A PROBABILITY SAMPLING PROGRAM
IX-D      RESTRICTED POPULATIONS
X-D       SUMMARY

          APPENDIX E: THE INTERVIEW AND THE OFFICE AS WE SAW THEM

I-E       THE INTERVIEW
II-E      THE PLANT

          APPENDIX F: DESIRABLE ACCURACY

I-F       DESIRABLE ACCURACY

          APPENDIX G: PRINCIPLES OF SAMPLING

I-G       SAMPLES AND THEIR ANALYSIS
II-G      SYSTEMATIC ERRORS

4. Structure of the main body 


In preparing the main body, we have stressed easy reference and 
have kept related matters together at the expense of fluency of arrange- 
ment and lack of repetition. Thus our main conclusions in a form in- 
tended for the general reader take 3 pages in the digest above, while 
more detailed conclusions, expressed for a more technical audience, take 
3 pages in Chapter XI. A particular subject summarized there is also 
likely to be discussed once in Chapter II, where we try to point out 
what KPM did, once again in one of Chapters IV to IX, where we 
assess KPM on an absolute scale, and yet again in Chapter X, where 
we compare KPM with previous workers in the field. This is repetitive, 
but we hope that it will permit ready reference and avoid treating 
subjects out of context. 

After this introductory chapter on background structure, the re- 
mainder of the main body falls into three parts: 







(i) Chapters II and III. In the first of these, we describe, respectively,
what choices KPM had to make and what they chose. In Chapter III
we outline some essential principles of sampling, which seem not to
have been clearly enough formulated or widely enough understood.
These chapters are introductory.

(ii) Chapters IV to XI. In the first six of these, we try to compare
KPM’s work with an absolute standard. The order chosen (interview,
sample, methodological checks, analytical techniques, complex exam-
ples, interpretation) is that in which the problems arise in an evolving
study such as KPM’s. Chapter X compares KPM with previous works
on the basis of Appendix B, while Chapter XI summarizes the conclu-
sions of this part.

(iii) Chapter XII. This discusses briefly various suggested expendi- 
tures of further effort. 


CONTENTS

CHAP. I. BACKGROUND AND ORGANIZATION

    Organizations involved
    Committee procedure
    Structure of this report as a whole
    Structure of the main body

CHAP. II. MAJOR AREAS OF CHOICE

    What sort of behavior?
    Whose behavior?
    Observed, recorded or reported behavior
    Interview or questionnaire, and types thereof
    Which subjects?

CHAP. III. PRINCIPLES OF SAMPLING

    Introduction
    Cluster sampling
    Possibilities of adjustment
    Probability samples
    Nonprobability samples
    Sampled population and target population

CHAP. IV. THE INTERVIEW AREA

    Interview vs. questionnaire
    Interviewing technique

CHAP. V. THE SAMPLING AREA

    KPM’s sampled population
    Could KPM have used probability sampling?

CHAP. VI. METHODOLOGICAL CHECKS

    Possible checks
    KPM’s checks

CHAP. VII. ANALYTICAL TECHNIQUES

    Variables affecting sexual behavior
    Definition of the variables
    Assessing effects of variables
    The measurement of activity
    Tests of significance
    [entry illegible]
    The accumulative incidence curve
    Other devices

CHAP. VIII. TWO COMPLEX ANALYSES

    Patterns in successive generations
    Statistical methods
    [entry illegible]
    Validity of inferences
    Vertical mobility

CHAP. IX. CARE IN INTERPRETATION

    Sample and sampled population
    Sampled population and target population
    Systematic errors of measurement
    Unsupported assertions
    The major controversial findings

CHAP. X. COMPARISON WITH OTHER STUDIES

    Sampling
    Analysis
    Interpretation

CHAP. XI. CONCLUSIONS

    Sampling
    Analysis
    Interpretation
    [entry illegible]
    The major controversial findings

CHAP. XII. SUGGESTED EXTENSIONS

    Probability sampling
    Retakes
    Spouses
    Presentation
    Statistical analyses
    Relative priorities

CHAPTER II. MAJOR AREAS OF CHOICE 
5. What sort of behavior? 


The purpose of Chapter II is to record in summary form the major 
choices made by KPM. 

Certainly the choice of orgasm as the central sort of sexual behavior 
for study was a major one, leading to consequences whose statistical 
aspects will be discussed in various places, but this choice is not a mat- 
ter of general quantitative methodology, and hence falls outside the 
scope of this committee’s task. 


6. Whose behavior? 


KPM had to choose the population to which this study should apply. 
This decision does not seem to have been made clearly. From the basis 
for the “U. S. Corrections” (p. 105) we should infer it to be “all U. S.
white males.” If it were the population to which the U. S. Corrected
sample actually applies on the average (the sampled population, see
Section 18), it would be a rather odd white male U. S. population.
It would have age groups, educational status, rural-urban background, 
marital status and all their combinations according to the 1940 census, 
but it would have more members in Indiana than in any other state, 
and it would have been selected to an unknown degree for willingness 
to volunteer histories of sexual behavior. We do not regard this descrip- 
tion of the sampled population as an automatic criticism, as some crit- 
ics do. We make it here as a factual statement, noting that the careful 
and wise choice of the sampled population, although difficult, is a rela- 
tively free choice of the investigator. More discussion relevant to this 
point will be found in Chapter II-G (Appendix G). 
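
The effect of a census correction of this kind can be sketched as follows. This is an illustration under stated assumptions, not KPM's actual procedure: the cells, sample sizes, incidences and census shares are hypothetical, and the function names `poststratified_mean` and `unweighted_mean` are introduced here only for the example. Each cell of the sample is weighted by its share of the census population rather than by its share of the sample.

```python
def poststratified_mean(sample_cells, census_shares):
    """Weight each cell's sample mean by that cell's census share.

    sample_cells: {cell: (number_of_histories, cell_mean)}
    census_shares: {cell: share of the census population}, summing to 1.
    """
    if set(sample_cells) != set(census_shares):
        raise ValueError("sample and census must use the same cells")
    if abs(sum(census_shares.values()) - 1.0) > 1e-9:
        raise ValueError("census shares must sum to 1")
    return sum(census_shares[cell] * mean
               for cell, (_n, mean) in sample_cells.items())


def unweighted_mean(sample_cells):
    """Mean over all histories, each history counting once."""
    total = sum(n for n, _mean in sample_cells.values())
    return sum(n * mean for n, mean in sample_cells.values()) / total


# Hypothetical cells defined by education only; all figures are invented.
sample_cells = {
    "grade school": (100, 0.30),
    "high school": (300, 0.40),
    "college": (600, 0.50),
}
census_shares = {"grade school": 0.50, "high school": 0.35, "college": 0.15}

print(f"unweighted: {unweighted_mean(sample_cells):.3f}")
print(f"corrected:  {poststratified_mean(sample_cells, census_shares):.3f}")
```

The correction balances only the variables used to define the cells. Anything left out of the cell definition, such as state of residence or willingness to volunteer a history, remains distributed as it was in the sample, which is the sense in which the corrected sample still describes a rather odd population.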

Further, KPM chose to study the behavior of many (at least 163 
in tabular form) segments of this large population, feeling, apparently, 
both that comparisons among segments would be illuminating and 
that data for (clinical) application to individuals should come from a
reasonably homogeneous segment. KPM’s choice of a broad population
created many problems, particularly in sampling. Whether they would
have been well advised to confine themselves to a more restricted popu- 
lation, e.g., the state of Indiana, is debatable. For our part, we are willing 
to take their choice as given, and to discuss briefly elsewhere some alter- 
natives for further work (Chapter IX-D). 


7. Observed, recorded, or reported behavior 


KPM, interested in actual behavior, had, in principle, the choice 
of studying observed, recorded, or reported behavior. But since they 
selected a broad population and orgasm as the type of behavior, their 
only feasible choice seems to have been reported behavior. This situa- 
tion does not seem likely to change in the foreseeable future. 

The choice of reported behavior implies that the question: “On the
average, how much difference is there between present reported and 
past actual behavior?” is seriously involved in any inferences about 
actual behavior which are attempted from KPM’s results. The differ- 
ence might well be large, leading to a large systematic error in measure- 
ment. However, use of observed or recorded behavior in order to avoid 
this difference does not seem to us a feasible way to measure nation- 
wide incidences and frequencies for KPM’s broad population, because 
it would have produced systematic errors in sampling possibly larger 
than the error in measurement. 


8. Interview or questionnaire, and types thereof 


Having settled on reported behavior, KPM had to decide whether 
this report should be oral or written, and what methods should be 
used to elicit it. Their choice was oral, in a face-to-face interview 
whose flavor was designed to be that of a doctor or family friend. 
The choice of oral rather than written report: 


(1) made it possible to obtain apparently satisfactory answers from 
many more subjects (the percentage of complete illiteracy in the 
U. S. is small, but the percentage of illiteracy on complex sub- 
jects not usually written about is undoubtedly substantial). 

(2) permitted and encouraged variation of the form of the questions 
to suit the subject and the situation. 


Those, like some critics, who believe in a repeatable measurement 
process, regardless of whether or not it measures something that is 
always relevant, find (2) bad. Those who, like KPM, feel that appro-
priately flexible wording improves communication and thus improves
the quality of report despite the variability resulting from changes 
in the form of questions, find (2) good. 

Given an interview rather than a questionnaire, the remaining 
choices of KPM follow a consistent pattern. In nearly every case
their approach resembled the clinical interview more closely than the 
psychometric test. 


9. Which subjects? 
Here there are various choices, pertaining to: 


(1) selection of individuals one at a time or in clusters. 

(2) keeping age, education, marital status, etc., segments in the sam- 
ple proportionate to those in the population or making them of 
more nearly equal size. 

(3) selecting individuals on a catch-as-catch-can basis, a partly ran- 
domized basis, or according to a probability sampling plan. 


They chose: 


(1) to select individuals in clusters. 
(2) to keep age, education, marital status, etc., segments more nearly 
equal in the sample than in the population. 


(3) to use no detectable semblance of probability sampling ideas. 


The pros and cons will be discussed later. 


10. What methodological checks? 


There are choices as to the types of checks and the number of each 
to be made. The types of checks made by KPM, including 


(1) take-retake, 

(2) husband-wife, 

(3) duplicate recording of interview, 
(4) overall comparison of interviews, 
(5) others (see Chapter V-A) 


seem to cover all those easily thought of. The numbers of checks made 
are discussed later. Duplicate recording of interviews occurred in an 
unknown, but presumably small, number of cases. No comparisons 
from duplicate recordings were reported, perhaps because most oc- 
curred in connection with the training of interviewers. 




11. How analyzed and presented? 


In analyzing frequency and incidence of activity, KPM chose to 
report both raw and “U.S. Corrected” data and to make simple com- 
parisons. Just what was done in general was clearly stated, but the 
steps involved in detailed computations were not explained. No at- 
tempt was made to find helpful scales or composite variables (see 
Chapters IV-C and V-C). 

With the exception of “U. S. Corrections,” most of the analysis of 
the tabular data is confined to straightforward description. Some at- 
tention is paid to the problem of sample-population relation in the form 
of standard errors (presumably underestimated because they were 
based on the assumption of random sampling). However, this ap- 
proaches lip service, since many apparent differences are discussed 
with no attention to significance or nonsignificance. (Again we do not 
regard this as an automatic criticism, particularly since accurate indi- 
cation of significance would have been difficult—see Section A-18.) 

In analyzing cumulative activity, KPM’s main tool was the accumu- 
lative incidence curve, a technique which they developed independ- 
ently. 


12. How interpreted? 
The main choices concerned 


(1) extent of warning about possible differences between reported 
behavior and actual behavior, 

(2) extent of warning about possible differences between the sam- 
pled population (see Section 18) and the entire U.S. white male 
population, 

(3) extent of warning about sampling fluctuations, 

(4) extent of verbal discussion not based on evidence presented, 

(5) certainty with which conclusions were presented. 


Under (1) the emphasis was on methodological checks in order to indi- 
cate, as far as they could, how small this difference seemed to KPM 
to be. Under (2) there was little discussion. Under (3) the warnings 
were made early, incompletely, but not often. Under (4) the extent 
of discussion was substantial, most of it aimed at social and legal attitudes
about sexual behavior, and descriptions of practices not covered
by the tables. Under (5) the conclusions were usually presented with an 
air of solid certainty. 

In general the observations seem to have been interpreted with more 
fervor than caution, although occasional qualifications may be found. 





CHAPTER III. PRINCIPLES OF SAMPLING 
13. Introduction 

It is difficult, if not impossible, to assess the quality of any sample 
and its analysis without comparing it with a set of principles. This is 
particularly true of KPM’s works. The present chapter endeavors to 
set down, in compact form, a few of the principles of sampling which 
are especially relevant to a consideration of KPM’s sampling. As we 
have noted (Section 6), KPM chose to select individuals in groups or 
clusters, to divide the population into segments and keep segment 
sizes more nearly equal in the sample than in the population, and to 
use no semblance of probability sampling ideas. The discussion in this 
chapter concentrates on these aspects of sampling. 

Many readers will, we believe, desire a more connected account of 
the principles of sampling, with examples and fuller discussion. These 
are provided in Appendix G. Any reader who finds the statements 
used in this chapter unclear, or not intuitively acceptable, is urged to 
turn to Appendix G before proceeding further. Once there, he should 
read through from the beginning, since argument and exposition there 
are closely knit and unsuited to piecemeal references. 

Whether by biologists, sociologists, engineers, or chemists, sampling 
is often taken too lightly. In the early years of the present century, it 
was not uncommon to measure the claws and carapaces of 1000 crabs, 
or to count the number of veins in each of 1000 leaves, and to attach 
to the results the “probable error” which would have been appropriate 
had the 1000 crabs or the 1000 leaves been drawn at random from the 
population of interest. If the population of interest were all crabs in a 
wide-spread species, it would be obviously almost impossible to take 
a simple random sample. But this does not bar us from honestly assess- 
ing the likely range of fluctuation of the result. Much effort has been 
applied in recent years, particularly in sampling human populations, 
to the development of sampling plans which, simultaneously, 


(i) are economically feasible, 
(ii) give reasonably precise results, and 
(iii) show within themselves an honest measure of fluctuation of 
their results.


Any excuse for the practice of treating non-random samples as random 
ones is now entirely tenuous. Wider knowledge of the principles involved 
is needed if scientific investigations involving samples (and what such 
investigation does not involve samples?) are to be solidly based. 
Additional knowledge of techniques is not so vitally important, though 
it can lead to substantial economic gains. 





14. Cluster sampling 


A botanist who gathered 10 oak leaves from each of 100 oak trees 
might feel that he had a fine sample of 1000, and that, if 500 were 
infected with a certain species of parasites, he had shown that the 
percentage infection was close to 50%. If he had studied the binomial 
distribution, he might calculate a standard error according to the usual 
formula for random samples, $p \pm \sqrt{pq/n}$, which in this case yields
$50 \pm 1.6\%$ (since $p = q = .5$ and $n = 1000$). In doing this he would neglect
three things: 


(i) probable selectivity in selecting trees (favoring large trees,
perhaps?),

(ii) probable selectivity in choosing leaves from a selected tree
(favoring well-colored or, alternatively, visibly infected leaves,
perhaps?), and

(iii) the necessary allowance, in the formula used to compute the 
standard error, for the fact that he had not selected his leaves 
individually. 


Most scientists are keenly aware of the analogs of (i) and (ii) in their 
own fields of work, at least as soon as they are pointed out to them. 
Far fewer seem to realize that, even if the trees were selected at ran- 
dom from the forest, and 10 leaves were chosen at random from each 
selected tree, (iii) must still be considered. But if, as might indeed be 
the case, each tree were either wholly infected or wholly free of infec- 
tion, then the 1000 leaves tell us no more than 100 leaves, one from 
each tree, since each group of 10 leaves will be all infected or all free 
of infection. In this event, we should take $n = 100$ in calculating the
standard error and find an infection rate of $50 \pm 5\%$. Such an extreme
case of increased fluctuation due to sampling in groups or clusters 
would be detected by almost all scientists, and is not a serious danger. 
But less extreme cases easily escape detection. 
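
The arithmetic of the oak-leaf example can be checked with a short sketch. The function names are illustrative, and the design-effect approximation 1 + (m - 1)ρ for equal-sized clusters is a standard textbook device that we assume here; it is not stated in the text. With ρ = 1 (each tree wholly infected or wholly free) it reproduces the extreme case described above.

```python
import math

def binomial_se(p, n):
    """Standard error of a proportion under simple random sampling."""
    return math.sqrt(p * (1.0 - p) / n)

def effective_sample_size(n_clusters, cluster_size, rho):
    """Effective n for equal-sized clusters with intracluster correlation rho.

    Uses the common approximation deff = 1 + (m - 1) * rho, which is an
    assumption of this sketch and not a formula taken from the text.
    """
    deff = 1.0 + (cluster_size - 1) * rho
    return n_clusters * cluster_size / deff

p = 0.5
naive_se = binomial_se(p, 1000)              # leaves treated as 1000 independent draws
n_eff = effective_sample_size(100, 10, 1.0)  # every tree wholly infected or wholly free
cluster_se = binomial_se(p, n_eff)

print(f"naive SE:   {100 * naive_se:.1f}%")
print(f"cluster SE: {100 * cluster_se:.1f}%")
```

Intermediate values of rho give the "less extreme cases" in which the inflation is real but easy to overlook.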

We have just described, as one example of the reasons why the 
principles of sampling need wider understanding, an example of 
cluster sampling, where the individuals or sampling units are not 
drawn separately and independently into the sample, but are drawn 
in clusters, and have tried to make it clear that “individually at ran- 
dom” formulas do not apply. Cluster sampling is often desirable, but 
must be analyzed appropriately. KPM’s sample was, in the main, 
a cluster sample, since they built up their sample from groups of people 
rather than from individuals. 





15. Possibilities of adjustment 


Often the population is divided into segments of known relative size, 
perhaps from a census. It is sometimes thought that the best method 
of sampling is to take the same proportion from every segment, so 
that the sample sizes in the segments match the corresponding popula- 
tion sizes. Such samples do have the advantage of simplifying computa- 
tions by equalizing weights, and they sometimes lead to a reduction 
of sampling error. But modern sampling theory shows that optimum 
allocation of resources usually requires different proportions to be 
sampled from different segments, whether the purpose is to estimate 
average values over the population or to make analytical comparisons 
between results in one group of segments and those in another. 

When there are disparities in the relative sizes of segments in the 
sample as compared with the population, whether accidental or 
planned, these disparities must be taken into account when we at- 
tempt to estimate averages over the whole population. One way in 
which this can be done is by adjustments applied to the segments. Such 
adjustments proceed as follows. Suppose that we know 


(i) the true fraction of the population in each segment, and 
(ii) the segment into which each individual in the sample falls. 


Then we can weight each individual in the sample by the ratio

$$\frac{\text{fraction of population in that segment}}{\text{fraction of sample in that segment}}$$


(It is computationally convenient to weight each segment mean with 
the numerator of this ratio; the result is algebraically identical to that 
described above.) 
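
A minimal sketch of this adjustment follows. The segment names, population fractions, and values are hypothetical and chosen only to make the arithmetic easy to follow; the function name is ours. It also checks the parenthetical remark that weighting segment means by the population fractions gives the same answer.

```python
def adjustment_weights(pop_fraction, sample_segments):
    """Weight for each sampled individual: population share / sample share."""
    n = len(sample_segments)
    counts = {}
    for seg in sample_segments:
        counts[seg] = counts.get(seg, 0) + 1
    return [pop_fraction[seg] / (counts[seg] / n) for seg in sample_segments]

# Hypothetical illustration: the sample over-represents the "college" segment.
pop_fraction = {"college": 0.2, "other": 0.8}
segments = ["college", "college", "college", "other", "other"]
values = [4.0, 6.0, 5.0, 1.0, 3.0]

w = adjustment_weights(pop_fraction, segments)
weighted_mean = sum(wi * v for wi, v in zip(w, values)) / sum(w)

# Equivalent shortcut: weight each segment mean by its population fraction.
seg_means = {
    s: sum(v for seg, v in zip(segments, values) if seg == s) / segments.count(s)
    for s in pop_fraction
}
shortcut = sum(pop_fraction[s] * seg_means[s] for s in pop_fraction)

raw_mean = sum(values) / len(values)
print(raw_mean, weighted_mean, shortcut)
```

Every individual in a segment receives the same weight, so the sketch changes the balance between segments and leaves the make-up of each segment untouched.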

The result of adjustment is a new “sampled population”—one such 
that the relative sizes of its various segments are very nearly correct 
(according to (i) above). Since the weight is the same for all the sample 
individuals in a given segment, adjustment does nothing to redress 
any selectivity which may be present within segments. If we adjust 
in this way, we remove one source of systematic error without affecting 
other sources at all. The philosophy of such adjustments is discussed 
further in Section G-12, and it is concluded that they may generally 
be appropriately made (within the limits discussed in sections C-16 
—C-18). Their chief danger is the possible neglect of the possibilities 
that they may be

(i) entirely too small,
(ii) too large,
(iii) in the wrong direction,

because of unredressed selectivity within the segments. When this possibility
exists, extreme caution in presenting the results of adjustment
is indicated.


16. Probability samples 


When probability samples are used, inferences to the population 
can be based entirely on statistical principles rather than subject- 
matter judgment. Moreover, the reliability of the inferences can be 
judged quantitatively. A probability sample is one in which 


(i) each individual (or primary unit) in the sampled population has 
a known probability of entering the sample, 

(ii) the sample is chosen by a process involving one or more steps of 
automatic randomization consistent with these probabilities, 
and 

(iii) in the analysis of the sample, weights appropriate to the proba- 
bilities (i) are used. 


Contrary to some opinions, it is not necessary, and in fact usually not 
advisable in a pure probability sample for 


(i) all samples to be equally probable, or 
(ii) the appearance of one individual in the sample to be unrelated 
to the appearance of another. 


In practice, because some respondents cannot be found or are unco- 
operative, we usually obtain, at best, approximate probability samples 
(see Sections A-2 and D-13) and have approximate confidence in our 
inference. 
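
The three conditions above can be illustrated with a small sketch. The population values and inclusion probabilities are hypothetical, and for simplicity we assume that each unit enters the sample independently of the others (the text notes that this is not required). Weighting each sampled unit by the reciprocal of its probability, a common choice of "appropriate" weight that we adopt as an assumption, gives an estimate of the population total whose average over all possible samples equals the true total, even though the samples are far from equally probable.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population: values y and known, unequal inclusion probabilities.
y = [Fraction(10), Fraction(20), Fraction(30), Fraction(40)]
prob = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]

def weighted_total(sample):
    """Estimate the population total, weighting each unit by 1 / probability."""
    return sum(y[i] / prob[i] for i in sample)

def expected_estimate():
    """Average the estimate over every possible sample, using exact fractions."""
    total = Fraction(0)
    for included in product([False, True], repeat=len(y)):
        p_sample = Fraction(1)
        for i, inc in enumerate(included):
            p_sample *= (prob[i] if inc else 1 - prob[i])
        sample = [i for i, inc in enumerate(included) if inc]
        total += p_sample * weighted_total(sample)
    return total

print(expected_estimate(), sum(y))
```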


17. Nonprobability samples 


Samples which are not even approximately probability samples 
vary widely in both actual and apparent trustworthiness. Their trust- 
worthiness usually increases as they are insulated more and more 
thoroughly from selective factors which might be related to the quanti- 
ties being studied. Insulation may be obtained by: 


(i) adjustments applied to the segment means in the sample, 
(ii) examination of the sample as drawn for signs of selection on a 
particular factor, 
(iii) partial randomization. 





Adjustment for segments, as explained in Section 15 above, corrects 
for any selective factor operating between segments, but corrects not
at all for selective factors operating within segments. If adjustment is
to be used, deliberate selectivity between segments may be exercised
without danger, so long as it does not imply selectivity within segments.

Negative results when the sample is examined for signs of selection
on a particular variable are comforting, and strengthen the reliability 
of the sample. The amount of this strengthening depends very much 
on the a priori importance of the variables checked to what is being 
studied. 

Deliberate (partial) randomization is a step toward a probability 
sample, and may be very helpful on occasion. 


18. Sampled population and target population 


We have found it helpful in our thinking to make a clear distinction 
between two population concepts. The target population is the popula- 
tion of interest, about which we wish to make inferences or draw 
conclusions. It is the population which we are trying to study. The 
sampled population requires a more careful definition but, speaking 
popularly, it is the population which we actually succeed in sampling. 

The notion of a sampled population can be more clearly described 
for probability sampling. In order to have probability sampling, we 
must know the chance that every sampling unit has of entering the 
sample, and the weight to be attached to the unit in the analysis. 
The sampled population may be defined as the population generated 
by repeated application of these chances and these weights. The fre- 
quency of occurrence of any particular sampling unit in the sampled 
population is proportional to the product 


(chance of entering the sample) × (weight used in analysis).


This product is made constant for a probability sample. Thus, with 
probability sampling, the sampled population consists of all sampling 
units which have a non-zero chance of selection. 
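
The product rule can be made concrete with a short sketch. The units and their chances are hypothetical, and the function name is ours. With weights equal to the reciprocal of the chance, every unit with a non-zero chance appears equally often in the sampled population; with equal weights, units appear in proportion to their chances.

```python
def sampled_population_shares(chance, weight):
    """Relative frequency of each unit in the sampled population.

    Proportional to (chance of entering the sample) * (weight used in analysis).
    """
    prod = {u: chance[u] * weight[u] for u in chance}
    total = sum(prod.values())
    return {u: prod[u] / total for u in prod}

# Hypothetical units; "D" has no chance of selection.
chance = {"A": 0.5, "B": 0.25, "C": 0.1, "D": 0.0}

# Probability-sample analysis: weight = 1 / chance wherever chance > 0.
prob_weight = {u: (1.0 / c if c > 0 else 0.0) for u, c in chance.items()}
shares_weighted = sampled_population_shares(chance, prob_weight)

# Unweighted analysis of the same sample: shares follow the chances themselves.
equal_weight = {u: 1.0 for u in chance}
shares_unweighted = sampled_population_shares(chance, equal_weight)

print(shares_weighted)
print(shares_unweighted)
```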

The sampled population is an important concept because by statisti- 
cal theory we can make quantitative inferential statements, with known 
chances of error, from sample to sampled population. It must be 
carefully distinguished from the target population, the population of 
interest, about which we are tempted to make similar inferential state- 
ments. 

Even with probability sampling, the sampled and the target population
usually differ because of the presence of “refusals,” “not-at-homes,”
“unable to classify,” and so on. The consequence of these
disturbances is that certain sampling units, although assigned a known 
chance of selection by the sampling plan, did not in fact have this 
chance in practice. 

With non-probability sampling, the situation is much more obscure. 
By its definition as given above, the sampled population depends on 
the existence of a sampling plan (which may be only a vague set of 
principles in the investigator’s head) and on the “chances” that any 
sampling unit had of being drawn. These chances are not well known— 
if they were, we should have a probability sample. But in many cases, 
it is reasonable to behave as if these chances exist and to attempt to 
estimate them, because they provide the only means of making statis- 
tical inferences beyond the non-probability sample to a corresponding 
“sampled population.” The difficulty comes in specifying, or some- 
times even thinking about, the nature of the sampled population. It is 
certain to be a weighted population where, for example, Theodosius 
Linklater may appear 1.37 times, while Basil Svensson appears only 
0.17 times. 

Insofar as we make statistical inferences beyond the sample to a 
larger body of individuals, we make them to the sampled population. 
The step from sampled population to target population is based on 
subject-matter knowledge and skill, general information, and intuition 
—but not on statistical methodology. 


CHAPTER IV. THE INTERVIEW AREA 


19. Interview vs. questionnaire 

The committee members do not profess authoritative knowledge 
of interviewing techniques. Nevertheless, the method by which the 
data were obtained cannot be regarded as outside the scope of the 
statistical aspects of the research. 

For what our opinion is worth, we agree with KPM that a written 
questionnaire could not have replaced the interview for the broad 
population contemplated in this study. The questionnaire would not 
allow flexibility which seems to us necessary in the use of language, in 
varying the order of questions, in assisting the respondent, in following 
up particular topics and in dealing with persons of varying degrees of 
literacy. This is not to imply that the anonymous questionnaire is 
inherently less accurate than the interview, or that it could not be 
used fruitfully with certain groups of respondents and certain topics. 
So far as we are aware, not enough information is available to reach a 
verdict on these points. 











20. Interviewing technique 


Many investigators have faced the problem of attempting to obtain 
accurate information about facts which the respondent is thought to 
be unwilling to report. It is natural to inquire whether KPM, in their 
interviewing technique, took advantage of accumulated experience 
as to the best methods for extracting the facts. But it is also well to 
inquire how much definite experience has been accumulated. 

The KPM interview impressed us as an extraordinarily skillful per- 
formance. Direct questions are put rapidly in an order which seems to 
these respondents hard to predict, so that it is difficult to tell what is 
coming next. Despite the air of briskness, we did not receive the im- 
pression that we were being hurried if we wished to reflect before re- 
plying, and supplementary questions or information were given if this 
seemed helpful to the memory. The coded recording of the data was 
done unobtrusively by the interviewer, so that the interview appeared 
to be a friendly conversation rather than any kind of an inquisition. 
These, of course, are personal impressions. 

KPM evidently think highly of the virtues of this technique, because 
it was adopted despite limitations which it imposes on the scope and 
rate of progress of the study. The technique makes great demands on 
the interviewer. The long period of training and the personal qualities 
required have restricted and will continue to restrict the interviewers 
to a very small number. This limits the speed with which data can be 
accumulated and also puts restrictions on the type of sampling that 
can be employed. 

The type of interview used by KPM differs markedly from the 
less directive methods which are sometimes recommended for dealing 
with taboo subjects. If the subject is likely to feel that his answer to a 
certain question will affect his prestige in the eyes of the interviewer, 
a less directive approach would be to conduct the interview in such a 
way that he gives the desired information without realizing that he is 
answering the awkward question. The KPM method is the antithesis 
of this. Research on interviewing techniques has not yet produced any 
substantial body of evidence as to the superiority of either the less 
directive methods or the KPM technique. 

With regard to specific inaccuracies in the KPM data, we believe that 
the interview gives an opportunity both for positive and negative bias. 
The KPM assumption that everyone has engaged in all types of activity
seems to some likely to encourage exaggeration by the respondents.
(KPM feel (personal communication) that their cross-checks are
highly effective in detecting such exaggeration.) On the other hand,
our impression from the interview was that a successful denial of cer- 
tain types of activity would be possible if the subject was prepared to 
do so, although we do not know the full extent of the KPM cross- 
checks which would lead them to be suspicious of such a denial. KPM 
assert (personal communication) that they regard cover-up as a more
likely source of bias than exaggeration. Our opinions on this statement 
are divided. 

As KPM point out (p. 48), the subject’s willingness to talk about 
certain types of activity is influenced by the attitudes of the social 
group to which he belongs. Until evidence to the contrary is presented, 
the presumption (made by some of the critics) that his final responses 
will also be influenced is one that cannot be cast aside. The size of these 
influences is still a matter of opinion. A corresponding element of doubt 
is present in almost all comparisons between different social levels, 
both those which provide some of the most interesting comparisons in 
the book, and those in many other studies. 


CHAPTER V. THE SAMPLING AREA 
21. KPM’s sampled population 


As noted above, KPM’s sample was deliberately disproportionate, 
partly in order to cover individual segments defined by age, education, 
religion, etc., in an adequate manner, partly because of geographical 
convenience. If the results for individual segments were to be based 
on samples of at least moderate size, such disproportion was necessary 
and wise. Its effects on overall results are less clear. It seems impossible 
to be sure what effect it had on the variability of the final result, and 
its use is certainly not a demonstrable error as far as variability is 
concerned. 

In their U. S. corrections, KPM provided adjustments for dispro- 
portion between segments defined by age, education, and marital status. 
As noted above (Section 17) we feel that such adjustments are usually 
appropriate. Due to absence of population data, they did not adjust 
for religion. The geographical imbalance of their sample was so great 
that an overall geographic adjustment was not feasible. Thus they com- 
pensated for some disproportions, and left others to produce what 
effects they would. 

Their only examination of the sample for signs of selection within 
segments is their comparison of 100% groups (groups where all members
were interviewed) with partial groups (groups where only part of
the members were sampled). This gives some insight into the effect of
volunteering as a selective factor. Beyond this, KPM report no serious 
effort to measure the actual effect of volunteering, or to discover what 
percentage of the population they would be able to persuade to be inter- 
viewed. 

They made no use of randomization. They might have attempted to 
sample, say, college seniors from two colleges drawn at random from a 
large list of colleges, but they are of the opinion (personal communica- 
tion) that this would have slowed up the work to an unmanageable 
extent. 

All in all, the absence of any orderly sampling plan contrasts strik- 
ingly with their usual methodical mode of attack on other problems. 

As stated briefly above (Section 6), the “sampled populations” 
corresponding to 


(1) KPM’s raw means, and to 
(2) KPM’s “U.S. corrected” means, 


respectively, are startlingly different from the composition of the U. S. 
white male population. (For example, although these sampled popula- 
tions have the U. S. average combination of education and rural- 
urban background, they have half of their members living in Indiana.) 
Since a complete probability sample seems to have been out of the 
question at the beginning of the KPM investigation, some such “sam- 
pled population” was to be expected, although it might have been some- 
what less distorted. Provided that further statistical analyses of the 
sort indicated in Appendix C, Chapter II-C were made, it would be 
possible to make adequate rigorous inferences from the sample to 
this ill-defined “sampled population.” 

The inference from these vague entities to the U. S. white male 
population depends on: 


(a) the inferrer’s view as to what these “sampled populations” are 
really like, and 

(b) the inferrer’s judgment as to how (reported) sexual behavior 
varies within segments. 


It is not surprising that experts disagree. 

The inference from KPM’s sample to the (reported) behavior of all 
U. S. white males contains a large gap which can be spanned only by 
expert judgment. This is a common phenomenon in social fields, but 
is still unfortunate. A considerable bridge across this gap would be 
furnished by a small probability sample. 







22. Could KPM have used probability sampling? 


If probability sampling could have been used, its use would have 
avoided one of the main gaps in KPM’s present chain of inference. 
We have, therefore, considered this possibility carefully. 

The difficulties in applying probability sampling to KPM’s study 
lie in the expenditure of time required to make the contacts necessary 
to persuade a predesignated man to give a history. By adapting the 
mechanism of the probability sample to KPM’s situation, these dif- 
ficulties may perhaps be reduced (see Appendix D, Chapter V-D). 
It would almost certainly have been impractical for KPM to have used 
a probability sample in the early years of their study. If KPM’s ap- 
parent “opinions” (p. 39 of KPM) as to the effectiveness of their pres- 
ent techniques of contact are correct, starting a probability sample 
would have been practical at any time since the appearance of the 
male volume in 1948.¹ However, KPM (personal communication, 1952)
feel that such an interpretation of their written statement is unwar- 
ranted. 

Since it would not have been feasible for KPM to take a large sam- 
ple on a probability basis, a reasonable probability sample would be, 
and would have been, a small one, and its purpose would be:


(1) to act as a check on the large sample, and 
(2) possibly, to serve as a basis for adjusting the results of the large 
sample. 


A probability sampling program planned to serve these purposes is
discussed in Appendix D, Chapter VII-D. Such a program should
proceed by stages because of the absence of information on costs and
refusal rates.

This conclusion about probability sampling does not excuse KPM 
from the responsibility for choosing geographical disproportion in 
order to save travel time and expense. The wisdom or unwisdom of this 
choice seems to depend on one’s view as to the magnitude of geographi- 
cal differences. Again, it is not surprising that experts disagree. 


CHAPTER VI. METHODOLOGICAL CHECKS 


23. Possible checks 


The primary check, if it could be made, is the comparison of average
actual behavior with average reported behavior. Variability in the
difference between actual and reported behavior is secondary in interest,
because high variability merely implies the necessity of larger numbers
of cases, while large average differences between actual and reported
behavior represent a systematic error that cannot be adjusted without
rather complete knowledge. Unfortunately this primary check does not
at present seem feasible in studying human sexual behavior as it occurs
in our culture.

¹ “The number of persons who can provide introductions has continually spread until now, in the
present study, we have a network of connections that could put us into almost any group with which
we wished to work, anywhere in the country.” (P. 39 of KPM.)

Of secondary importance are checks of the single actual report with 
the average actual report, where averages may be taken over fluctuations,
time, spouses, and/or interviewers. (See Appendix A, Chapter
V-A.) In this second category, the following possible comparisons
suggest themselves:

1. Reinterviews of the same respondent 

2. Comparison of spouses 

3. Comparison of interviewers on the same population segment 

4. Duplicate interviews by the same interviewer at various times. 


24. KPM’s checks 


The only comparison of observed and reported behavior which KPM 
found feasible was the date of appearance of pubic hair, which agreed 
quite successfully. This is a physical characteristic, different in character
and emotional loading from the behavior of main interest. Some
subjects may have had to rely upon general information, plus some
assistance from the interviewer, in naming a date for themselves. 
Thus this check furnishes rather weak support. 

At the level of rechecks on respondents, some information is avail- 
able but more is needed. Similarly, comparisons of spouses have been 
made for a relatively selected group. The checks themselves are en- 
couraging, but more cases are needed. 

Some attempts have been made to compare the staff interviewers 
but since there is some selection in the assignment of cases, these 
comparisons do not meet the problem as squarely as interviews of the 
same respondent by different interviewers, or the recorded interview 
technique. 

A comparison of early versus late interviews by Kinsey is given in 
KPM, but it is hard to tell, for example, whether the 12.4% drop 
(from 44.9% to 32.5%) in the accumulative incidence for total premarital
intercourse at age 19 (single males, education level 13+) from
early to late interviews is due to differing groups sampled, instability
in the interviewing process, or reasonable sampling variation for cluster
sampling (KPM p. 146).







KPM have made serious efforts to check their work in the aspects 
where checking seems feasible. However, improved and more extensive 
checking is needed. Although duplicate recording of interviews is men- 
tioned, no data have been published. Even if they must be based on 
very few cases, such comparisons should be made available. 


CHAPTER VII. ANALYTICAL TECHNIQUES 
25. Variables affecting sexual behavior 


After introductory chapters (5 and 6) on early sexual growth and 
activity, KPM proceed to examine the effects of the following vari- 
ables: 

Age 

Marital status 

Age of adolescence 

Social level 

Comparison of two generations 

Vertical mobility in the occupational scale

Rural-urban background

Religious background 


In this chapter we attempt to appraise, in general terms, the analytical 
techniques used by KPM in their study of these variables. 


26. Definition of the variables 


Some of the variables: age of adolescence, social level, occupational 
level, rural-urban background and religious background, involve prob- 
lems of definition. These seem to have been in the main thoughtfully 
handled and presented by KPM. For instance, KPM discuss the rela- 
tive merits of educational level attained by the subject and of the oc- 
cupational class of the subject and of his parents as a measure of social 
level (pp. 330-32). In their opinion, educational level is the most satis- 
factory criterion and this was adopted for the analysis. In the case of 
religious affiliation, KPM distinguish between active and inactive pro- 
fession of religious faith, though the definition of the two terms is not 
made entirely clear. 

The definition which looks least satisfactory is that of age of adoles- 
cence (p. 299), where the problem is formidable. The criteria employed 
by KPM appear difficult for the reader to interpret. 


27. Assessing effects of variables 


With a multiplicity of variables which may interact on each other, 
the task of assessing the importance of each variable individually is 







not easy. Examination of the variables one by one, ignoring all other 
variables except the one under scrutiny, may give wrong conclusions, 
because what appears on the surface to be the effect of one variable 
may be merely a reflection of the effects of other variables. 

A thorough attack on this problem calls for a multiple-variable ap- 
proach in which all effects are investigated simultaneously. This re- 
quires a high degree of statistical maturity and of skill in presentation. 

The method utilized by KPM is a compromise. In general, with some
exceptions, they regard age, marital status and educational level as
basic variables, which are held fixed or compensated for in the
investigation of each of the remaining variables. The other variables are
disregarded for the moment. Although we have not examined the matter
exhaustively, this policy seems to have been justified by events,
because KPM claim from their analyses that the other variables, with
the exception of age at adolescence, have had relatively minor effects.
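
The danger of examining variables one at a time, described at the start of this section, can be shown with a small sketch. The counts below are invented purely for illustration and are not KPM data; the names `cells`, `marginal_rate` and `stratified_rates` are likewise hypothetical. In this made-up table the rural and urban groups differ in incidence when education is ignored, yet show no difference at all within either educational level, because the two groups differ in educational composition.

```python
# Hypothetical counts, NOT KPM data.
# Key: (educational level, background) -> (number active, number in cell)
cells = {
    ("low_educ", "rural"): (30, 100),
    ("low_educ", "urban"): (120, 400),
    ("high_educ", "rural"): (240, 400),
    ("high_educ", "urban"): (60, 100),
}


def rate(pairs):
    """Pooled incidence over a list of (active, total) pairs."""
    active = sum(a for a, _ in pairs)
    total = sum(t for _, t in pairs)
    return active / total


def marginal_rate(background):
    """Incidence for one background level, ignoring education."""
    return rate([v for (_, bg), v in cells.items() if bg == background])


def stratified_rates(background):
    """Incidence for one background level within each educational level."""
    return {educ: rate([v]) for (educ, bg), v in cells.items() if bg == background}


print(marginal_rate("rural"), marginal_rate("urban"))
print(stratified_rates("rural"), stratified_rates("urban"))
```

Holding the basic variable fixed, as KPM do for age, marital status and educational level, corresponds to comparing the stratified rates rather than the marginal ones.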


28. The measurement of activity 


In the KPM tables, activity is measured by “incidence” (per cent of 
the population who engage in the activity) as well as by frequency per 
week. In some tables, both mean and median frequencies are given, 
and also frequencies for the total and for the active population. There 
are advantages in presenting various measures. On the other hand, 
inspection suggests that all these measures are correlated: that is, 
to some extent they tell the same story. A complex internal analysis 
would probably show about how many measures are really needed to 
extract the information in the data and what individual measurements, 
or combinations of them, are best for this purpose. Perhaps a single 
one, or at most two, would suffice. As it is, both KPM and the indus- 
trious reader have to wade through tables and discussion of a number 
of different measurements, without being clear whether anything new 
is learned. Simplification would be pleasant, but is far from essential. 


29. Tests of significance 


In the discussion of effects which they regard as real, KPM make 
little appeal to tests of significance. They often present standard errors 
attached to the mean frequencies for individual cells. Because sampling 
was non-random and was by groups, these standard errors, calculated 
on the assumption of randomness, are under-estimates, perhaps by a 
substantial amount. The standard errors have a kind of negative vir- 
tue, in the sense that if a difference is not significant when judged 
against these errors, it would not be significant if a valid test could be 







devised. The problem of devising a realistic estimate of the true stand- 
ard errors is one of considerable complexity (see Section II-C). 
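
To give a rough sense of how far standard errors computed on the assumption of randomness can understate the truth when sampling is by groups, the sketch below uses the familiar equal-cluster-size approximation in which the variance is inflated by 1 + (m - 1) * rho, where m is the group size and rho the intraclass correlation. This approximation is brought in here only for illustration; it is not a method used by KPM or proposed in this report, and the values of m and rho are hypothetical.

```python
import math


def srs_standard_error(p, n):
    """Standard error of a proportion assuming simple random sampling."""
    return math.sqrt(p * (1.0 - p) / n)


def design_effect(cluster_size, intraclass_corr):
    """Approximate variance inflation for equal-sized clusters."""
    return 1.0 + (cluster_size - 1) * intraclass_corr


def cluster_standard_error(p, n, cluster_size, intraclass_corr):
    """Standard error after inflating the random-sampling variance."""
    inflation = design_effect(cluster_size, intraclass_corr)
    return srs_standard_error(p, n) * math.sqrt(inflation)


# Hypothetical illustration: 400 respondents taken in groups of 20,
# with an assumed intraclass correlation of 0.1.
naive = srs_standard_error(0.4, 400)
adjusted = cluster_standard_error(0.4, 400, 20, 0.1)
print(naive, adjusted, adjusted / naive)
```

Under these assumed values the adjusted standard error is roughly 1.7 times the naive one. Since the inflation factor is never below one when rho is non-negative, a difference that fails to reach significance against the naive error would also fail against the adjusted error, which is the "negative virtue" noted above.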

We have been unable to discover from the book the principles by 
which KPM decide when to regard an effect as real. The size of the 
effect is one criterion. Size should certainly be taken into account, 
since an effect may be significant statistically but too small to be of 
biological or sociological interest. They evidently attach some impor- 
tance to the consistency with which an effect is exhibited in different 
parts of a table. As a criterion, consistency is of variable worth. Con- 
sistency over different age groups (where age denotes age at the time 
of the reported activity) is of little worth, since there is inevitably 
substantial correlation between sampling fluctuations of reported 
activities at neighboring ages because the same subject appears in 
neighboring age groups. More weight can be attached to consistency 
over different educational levels, because different groups of subjects 
are involved. 

To summarize, statements about the data in their tables lie at the 
level of shrewd descriptive comment, rather than at the level of an 
attempt to make inferential statements from a sample to a clearly de- 
fined population (even though this could not be the U. S. white male 
population). 

We do not propose to discuss the analysis for each variable sepa- 
rately. Two analyses which have attracted much attention will be con- 
sidered later (Sections 33 to 37). 


30. U. S. Corrections 


In most sampling plans it is necessary to provide a set of weights 
for the segments of the sampled population to recover accurate esti- 
mates for the target population (i.e. the population about which 
inferences are desired). That such adjustments are usually appro- 
priate, whether probability or nonprobability samples are employed, 
has already been pointed out (Section 17, see Section II-G). 

Since KPM have as their target population U.S. white males, we can 
reasonably expect them to apply weights in an attempt to correct 
for disproportionate representation in the sampled population of some 
segments of the target population. 

KPM supply U. S. Corrections (pp. 106-9) and use them rather
consistently throughout the work. There are no examples given explaining
the application of the weights. The critics, and sometimes this commit- 
tee, have had difficulty in verifying computations where they have 
been used. Of the 13 tables where corrections could be checked com- 







pletely, one checked, 10 checked except for one age group each, and two 
were not checked by the correction mentioned in the text. Apparently 
the exposition could be improved. 

The U. S. Corrections should be used, but it might be possible to 
make a more effective choice of segments (see A-43 and V-C and II-G).

KPM did not sufficiently warn the reader that U.S. corrected figures 
are not corrected for selection within segments, and may be seriously 
biased. 
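
Since KPM give no worked example of applying the weights, a minimal sketch of the general idea may help. It assumes the correction amounts to weighting each segment mean by that segment's share of the target population; the segment labels, means, sample sizes and population shares below are all invented for illustration and are not taken from KPM's tables.

```python
def unweighted_estimate(segment_means, sample_sizes):
    """Overall mean when segments count in proportion to their sample sizes."""
    n = sum(sample_sizes.values())
    return sum(segment_means[s] * sample_sizes[s] for s in segment_means) / n


def weighted_estimate(segment_means, population_shares):
    """Overall mean when segments count in proportion to the target population."""
    total_share = sum(population_shares.values())
    weighted_sum = sum(segment_means[s] * population_shares[s] for s in segment_means)
    return weighted_sum / total_share


# Hypothetical segments (educational levels) with invented figures.
segment_means = {"0-8": 2.0, "9-12": 3.0, "13+": 4.0}
sample_sizes = {"0-8": 100, "9-12": 200, "13+": 700}
population_shares = {"0-8": 0.4, "9-12": 0.4, "13+": 0.2}

print(unweighted_estimate(segment_means, sample_sizes))
print(weighted_estimate(segment_means, population_shares))
```

The weights change only how much each segment counts. Each segment mean is taken as given, so any selection operating within a segment passes through to the corrected figure unchanged, which is the limitation noted in the preceding paragraph.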


31. The accumulative incidence curve 


KPM have a useful device for summarizing incidence data by age. 
This accumulative incidence curve gives the percentage of individuals 
in the sample (reporting for a given age) to whom a particular event 
has occurred before that age. Although the explanation of the concept 
of accumulative incidence is not as clear as most of KPM’s writing, the 
computations made are satisfactory. When there are no generation- 
to-generation changes in the population and no differential recall 
depending on age at report, this method is particularly justified, be- 
cause it packs all the incidence data neatly into one grand summary. 
(For discussion of the critics’ comments see A-39.) No better method 
for overall comparisons seems to be available. 
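
Because the concept is easier to grasp from a computation than from a verbal definition, here is a minimal sketch of an accumulative incidence calculation as we read the definition above. It assumes each subject is recorded as a pair (age at interview, age at first occurrence, or None if the event has not occurred) and that a subject "reports for" a given age if interviewed at that age or later. The records are invented, and the function name is hypothetical.

```python
def accumulative_incidence(subjects, age):
    """Per cent of subjects reporting for `age` to whom the event occurred before `age`.

    `subjects` is a list of (age_at_interview, age_at_first_occurrence or None).
    Returns None when no subject is old enough to report for `age`.
    """
    eligible = [(a, first) for a, first in subjects if a >= age]
    if not eligible:
        return None
    occurred = sum(1 for _, first in eligible if first is not None and first < age)
    return 100.0 * occurred / len(eligible)


# Invented records for illustration only.
subjects = [(18, 16), (18, None), (25, 17), (25, 22), (40, None), (40, 19)]
curve = {age: accumulative_incidence(subjects, age) for age in (18, 20, 25, 45)}
print(curve)
```

Note that the base changes with age: the younger subjects drop out of the denominator at ages they have not yet reached. This is why the curve is a fair summary only when older and younger subjects would give the same report for the same age, that is, when there are no generation-to-generation changes and no differential recall.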


32. Other devices 


1. KPM did some extensive sampling experiments on their data, 
with a view to discovering the sample size needed for the accuracy 
they desired. These experiments turned out to be almost valueless 
because KPM did not take account of the necessary statistical princi- 
ples (see A-19). 

2. The committee had an opportunity to inspect the KPM facilities 
on a visit to Bloomington, Indiana. We observed that the data sheets 
were neatly filled out, that the files were well kept, that requests for 
original data were usually met in a matter of moments, and that the 
office was well equipped for handling the extensive data with which 
KPM deal. 

3. The KPM volume was written while data were still being col- 
lected. Apparently KPM chose to use all the data on hand at the time 
a particular point was being analyzed (personal communication from 
KPM). Thus different tables have different totals, a source of annoy- 
ance to critics and users of the book. The reasons for this should have 
been pointed out by KPM. The additional interviewing was deliber- 
ately selective with an aim to strengthen weak segments (personal 







communication from KPM). It seems to us that, if this strengthening 
was necessary for later analyses, it would have been worthwhile to add 
the new material to the early tabulations. This would also have in- 
creased comparability and avoided the problems raised by the exist- 
ence of many different sampled populations. 


CHAPTER VIII. TWO COMPLEX ANALYSES 


33. Patterns in successive generations 


In this chapter we discuss briefly two analyses by KPM which have 
attracted much attention. Our object is to give two specific illustra- 
tions of the kind of analysis which they chose to undertake, with 
comments on their competence. 

The first analysis was made by dividing the sample into two groups: 
those over 33 years of age at the time of interview, with a median age 
of 43.1 years, and those under 33 years at the time of interview, with a 
median age of 21.2 years. 

Our comments deal with three topics: (i) the statistical methodology
employed, (ii) KPM’s summary of their tables, and (iii) the general problem
of inference from data of this type.


34. Statistical methods 


In the comparisons, educational level and age at the time of the 
activity are held constant and in nearly all comparisons marital status 
also. The method used to compare the group means seems satisfactory 
except for some minor points, discussed in A-25, A-33 and A-43. 

It would have been helpful to present classifications of the older and 
younger groups according to other factors which might influence sexual 
activity, e.g., rural-urban background, religious affiliation, marital 
status at age 20 or 25. The two groups would not necessarily agree 
closely in these break-downs, for there has been a slow drift towards 
the towns, and perhaps a drift towards “inactive” rather than “active” 
religious affiliation. For interpretive purposes it is advisable, in any 
event, to learn as much as possible about the compositions of the older 
and younger groups. Some critics have claimed that the older genera- 
tion is “atypical.” 


35. KPM’s summary of their tables 


The data are presented in 8 large tables (98-105). As a statistician 
learns from experience, a competent summary of a large body of data 
is not an easy task. KPM give a detailed discussion of the accumulative 







incidence data for each type of outlet, followed by a similar discussion 
of the frequency data. 

These detailed comments on what the data appear to show seem 
sound, except that on two occasions where the younger group showed 
greater sexual activity, KPM ignored or played down the difference 
between the two groups (Section A-45). 

Their general summary statement reads in part as follows: 


“The changes that have occurred in 22 years, as measured by the data 
given in the present chapter, concern attitudes and minor details of be- 
havior, and nothing that is deeply fundamental in overt activity. There has 
been nothing as fundamental as the substitution of one type of outlet for 
another, of masturbation for heterosexual coitus, of coitus for the homo- 
sexual, or vice versa. There has not even been a material increase or decrease 
in the incidences and frequences of most types of activity. ... 

“And the sum total of the measurable effects on American sexual be- 
havior are slight changes in attitudes, some increase in the frequency of 
masturbation among boys of the lower educational levels, more frequent 
nocturnal emissions, increased frequencies of premarital petting, earlier 
coitus for a portion of the male population, and the transferences of a per- 
centage of the pre-marital intercourse from prostitutes to girls who are not 
prostitutes.” 


Some critics have objected strongly to this statement, particularly 
the first paragraph, on the grounds that it gives a biased report by
brushing aside the differences in activity, which are almost all in the
direction of higher or earlier sexual activity by the younger group. 
The reporting does appear a little one-sided, in that the reader is en- 
couraged to conclude that the differences are immaterial, although 
KPM do not state what they mean by a “material” increase. On the 
other hand, the catalogue of differences, given at the end of the second 
paragraph above, includes all differences noted either by KPM or 
the critics, except for an increased homosexual activity in the younger 
group at educational levels 0-8 and 9-12. 


36. Validity of inferences 


Two objections have been made by some critics to any inferences 
drawn from a comparison of this type. The first is that the groups may 
not be representative of their generations. KPM have attempted to 
dispose of this objection, at least in part, by holding educational level 
and marital status constant. It might be possible to go further and hold 
other factors constant, or at least examine whether the samples from 
the two generations differ in these factors. But with non-random 
sampling the objection is not removed even if a number of factors are 







held constant, because one or both groups might be biased with respect 
to some factor whose importance was not realized. Various opinions 
may be formed as to the strength of the objection, but it can be re- 
moved only by the use of probability sampling accompanied by valid 
tests of significance. 

Secondly, in a comparison of this type, the older generation is describ- 
ing events which involve a much longer period of recall, with a possi- 
bility of distortion as events become distant. Further retake studies, 
if KPM can continue them for a sufficiently long period, may throw 
some light on the strength of this objection. 

The joint effect of these objections is to render the conclusions tenta- 
tive rather than definitely established. 


37. Vertical mobility 


This analysis (pp. 417-47) shows a degree of ingenuity and sophisti- 
cation which is not too common in quantitative investigations in soci- 
ology. The data are arranged in a two-way array according to the oc- 
cupational class of the subject at the time of interview and the occupa- 
tional class of the parents. KPM examine whether the pattern of sexual 
activity of the subject is more strongly associated with the parental 
occupational class than with that attained by the subject. They
conclude (p. 419):

In general, it will be seen that the sexual history of the individual accords 
with the pattern of the social group into which he ultimately moves, rather 
than with the pattern of the social group to which the parent belongs and 
in which the subject was placed when he lived in the parental home. 

The most significant thing shown by these calculations (Tables 107-115)
is the evidence that an individual who is ever going to depart from the
parental pattern is likely to have done so by the time he has become
adolescent.


The amount of data which KPM present in this analysis is worth 
mention as evidence that they do not shirk work. Tables are given for 
7 types of activity. Three age groups are shown in each table. When 
we classify by occupational level of subject and parent, this leads to 
21 two-way tables. Five measures of the type of activity are given, so 
that a painstaking examination extends over 105 two-way tables. 

KPM appear to have paid most attention to the frequency data. 
Their task is to determine whether this shows a stronger association 
with the occupational class of the subject or of the parent. In reaching 
a verdict, they rely on judgment from eye inspection. By a similar eye 
inspection, we agree with their verdict as a descriptive statement of 







what the data indicate, although different individuals might disagree 
as to how definitely their statement holds. Judgments made by one 
individual for the data on frequencies were that in 7 of the 21 two-way 
tables, association with subject and parent either was not present at 
all or looked about equal. In 9 it looked mildly more with the subject 
and in 5 it looked strongly more with the subject. 

It would be of interest to undertake a more objective analysis. Analy- 
sis of variance techniques are available for this purpose, although some 
theoretical problems remain. 
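
As one indication of what a more objective analysis might look like, the sketch below splits the variation in a complete two-way table of mean frequencies into a part associated with rows (here taken as the subject's occupational class), a part associated with columns (the parental class), and a residual. This is an illustrative simplification only: it treats each cell as a single value, ignores unequal cell sizes and the non-random sampling, and uses an invented table rather than KPM's figures.

```python
def two_way_sums_of_squares(table):
    """Row, column and residual sums of squares for a table with one value per cell."""
    r = len(table)
    c = len(table[0])
    grand = sum(sum(row) for row in table) / (r * c)
    row_means = [sum(row) / c for row in table]
    col_means = [sum(table[i][j] for i in range(r)) / r for j in range(c)]
    ss_rows = c * sum((m - grand) ** 2 for m in row_means)
    ss_cols = r * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    return {
        "rows": ss_rows,
        "cols": ss_cols,
        "residual": ss_total - ss_rows - ss_cols,
    }


# Invented mean frequencies: rows = subject's class, columns = parental class.
table = [
    [1.0, 1.2, 1.1],
    [2.0, 2.1, 2.2],
    [3.1, 3.0, 3.2],
]
ss = two_way_sums_of_squares(table)
print(ss)
```

In this invented table nearly all the variation lies between rows, which would correspond to an association "strongly more with the subject." A formal test would still require a defensible error term, which is among the theoretical problems mentioned above.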

So far as interpretation is concerned, the principal disturbing factor 
is the possibility, which some critics have mentioned, that the subject’s 
reports of his activity are influenced by the social level to which he 
belongs at the time of interview. KPM maintain that attitudes towards 
different types of activity are strongly affected by the social level of the 
subject. Whether they change when he changes his social level would 
be interesting to discover. Something might be learned by retakes for 
subjects who had moved in the social scale. To obtain an abundant body 
of data of this kind will, however, be a slow and difficult process. 


CHAPTER IX. CARE IN INTERPRETATION 
38. Sample and sampled population 


In sample surveys, the inference from sample to sampled population 
is often relatively straightforward, although not trivial. We can usually 
set limits so that the statement “the sample agrees with the sampled 
population within these limits” has approximately the agreed-upon 
risk. (We may have to work fairly hard to set these limits correctly.) 
But we have always to remember, and usually must remind the reader 
steadily, that these limits are not infinitely narrow. 
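
For a reported percentage, limits of this kind can be sketched with the usual normal approximation. The function below is an illustration under stated assumptions, not a procedure from KPM or from this report: it assumes a large sample, takes `risk` as the agreed-upon two-sided risk, and allows an optional, hypothetical `design_effect` multiplier on the variance for samples taken in groups.

```python
import math
from statistics import NormalDist


def proportion_limits(p_hat, n, risk=0.05, design_effect=1.0):
    """Approximate limits for a population proportion at the given two-sided risk."""
    z = NormalDist().inv_cdf(1.0 - risk / 2.0)
    half_width = z * math.sqrt(design_effect * p_hat * (1.0 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Hypothetical illustration: an observed incidence of 50 per cent in 100 cases.
print(proportion_limits(0.5, 100))
print(proportion_limits(0.5, 100, design_effect=2.0))
```

However large the sample, the limits returned have positive width, and they widen further once any grouping in the sample is allowed for.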

KPM’s caution on page 153 (quoted in Appendix A, Section 48) is 
a caution, but it is not repeated. 

In general, their statements about small differences are more forth- 
right than we would care to make. 


39. Sampled population and target population 


When a respectable approximation of a probability sample is in- 
volved, the step from sampled population to target population is usual- 
ly short and the inference strong. Otherwise, the inference is often 
tortuous and weak. It depends on subject matter knowledge and intui- 
tion, and on other barely tangible considerations. These considerations 










deserve to be brought to the reader’s attention, and to be discussed
as best the authors may.

This KPM did not do adequately. Their discussion of diversification
(p. 92) and 100 per cent samples (p. 93) is only a beginning.


40. Systematic errors of measurement 


Any quantitative study offers the possibility of systematic errors 
of measurement. It is generally agreed that these possibilities should be 
placed before the reader and discussed. 

In KPM’s study these possibilities concentrate on the difference 
between present reported and past actual behavior. KPM spent 
Chapter 4 on this question. Their discussion is generally good, except 
on some questions which arise in connection with generation-to- 
generation comparison (see Sections A-25 and A-44). 


41. Unsupported assertions 


We are convinced that unsubstantiated assertions are not, in them- 
selves, inappropriate in a scientific study. In any complex field, where 
many questions remain unresolved, the accumulated insight of an ex- 
perienced worker frequently merits recording when no documentation 
can be given. However, the author who values his reputation for ob- 
jectivity will take pains to warn the reader, frequently repetitiously, 
whenever an unsubstantiated conclusion is being presented, and will 
choose his words with the greatest care. KPM did not do this. 

Many of the most interesting statements in the book are not based 
on the tabular material presented and it is not made at all clear on what 
evidence the statements are based. Nevertheless, the statements are 
presented as if they were well-established conclusions. 


42. Some major controversial findings 


Some KPM findings about which much scientific discussion has cen- 
tered relate to: 


(i) stability of sexual patterns, 
(ii) homosexuality, and 
(iii) the effects of vertical mobility. 


In all these areas KPM have made forthright and bold statements. 
As discussed in more detail in Sections A-45 to A-47 (also see A-25), 
there are reasons for caution in every one of the three areas. 







CHAPTER X. COMPARISON WITH OTHER STUDIES* 
43. Interviewing 


Good sex studies have been made using both the personal interview 
and questionnaire techniques. Given that just one technique is to be 
employed, KPM’s choice of personal interview seems necessary if 
illiterates or near-illiterates are to be sampled. At present, it is good 
practice in gathering this type of data to endeavor to have all subjects 
give information on as many relevant points of the study as possible. 
No study seems to have done better on this matter than KPM. 

Whether it is always good practice to standardize the questions 
asked is debatable. KPM did not do this and give telling arguments 
against the practice. Some other studies have standardized the ques- 
tions, both in personal interview and in self-administered question- 
naires, and they have included good arguments in favor of their pro- 
cedure. In training interviewers KPM seem to have gone to greater 
lengths (a year of training) in preparing for the specific interview used 
in the study, than any of the other personal interview studies. Informa- 
tion on training of interviewers is fairly hard to come by in all these 
studies. 

Given the choice of personal interview, it is not possible at this writ- 
ing to be logically certain whether the KPM technique is better or 
worse than that of the other interview studies, no matter whether one 
approves or disapproves of the tactics of a diagnostician or medical 
detective. Some discussion of how the KPM interview appeared to 
us is given in Appendix E. Numerous cross-checks on frequency and 
dates of occurrences appear within the KPM interview, while they 
seem to be lacking in most other studies. Setting aside points on which 
there is no evidence, KPM’s interviewing is as good as or better than 
that of the other studies reviewed. 





* The material in this chapter is our inference from the reviews supplied by W. O. Jenkins and 
presented in Appendix B. We have not personally read all the volumes concerned. The volumes are as 
follows: 

Bromley, Dorothy D., and Britten, Florence H. Youth and sex. New York: Harper and Brothers, 1938.
Davis, Katherine B. Factors in the sex life of twenty-two hundred women. New York: Harper and Brothers, 1929.
Dickinson, R. L., and Beam, Lura A. The single woman. Baltimore: Williams and Wilkins Co., 1934.
Dickinson, R. L., and Beam, Lura A. A thousand marriages. Baltimore: Williams and Wilkins Co., 1931.
Farris, E. J. Human fertility and problems of the male. White Plains, N.Y.: Author’s press, 1950.
Hamilton, G. V. A research in marriage. New York: A. and C. Boni, 1929.
Kinsey, A. C., Pomeroy, W. B., and Martin, C. E. Sexual behavior in the human male. Philadelphia: W. B. Saunders Company, 1948.
Landis, C., et al. Sex in development. New York and London: Paul B. Hoeber, 1940.
Landis, C., and Bolles, M. M. Personality and sexuality of the physically handicapped woman. New York and London: Paul B. Hoeber, 1942.
Terman, L. M., et al. Psychological factors in marital happiness. New York: McGraw-Hill Book Co., 1938.







44. Checks 


As for checks on the interviewing process, KPM unquestionably 
lead the field with 100 per cent samples, retakes, spouse comparisons, 
early vs. late groups, interviewer comparisons, and the pubic hair study. 
Some authors mention casual checks with no data supplied. Bromley 
and Britten compare interview and questionnaire results on different 
groups. Davis reports a study where 50 subjects were interviewed 
before and after questionnaire administration, and offers a breakdown 
by consecutive 100 questionnaires received. Dickinson and Beam’s
two books speak of comparing verbal reports and physical examination 
results as a way of verifying the record rather than as a check—no 
records seem to be published. Farris’ comparison of reported vs. per- 
sonally recorded masturbatory rates omits the critical comparative 
information. Hamilton finds that different question wordings give 
different responses, but leaves the matter here. Landis and Bolles use 
several independent judges for evaluation of scales—but, instead of 
comparing their results, argue that agreement will be good because of 
experience and training. They do not compare normal with handicapped 
subjects. Landis checks with the psychiatric case history as a means 
of eliminating subjects with discrepancies, and gives data on the agree- 
ment of independent judges’ ratings. Terman offers spouse comparisons. 
When KPM’s checks are viewed with those of the other leading sex
studies in mind, it is clear that a new high level has been established.


45. Sampling 


All studies used volunteer non-probability samples. Some were drawn 
from more specifiable target populations than others. For example, 
Bromley and Britten drew exclusively from college volunteers, while 
Davis used mail-questionnaire respondents from lists of Women’s 
Clubs and college alumnae. Others used well-to-do patients, or clinic 
groups. Aside from KPM, Bromley and Britten is the only study that 
seems to have attempted to get nationwide geographic representation 
(we have omitted M. J. Exner’s 1915 study), while Davis has covered 
the eastern area, and Terman covers part of the California area. Al- 
though KPM’s sample is heavily charged with college students, a 
broader representation of social and educational levels is offered than 
in the other studies. All studies reviewed have special features which 
make generalizations to specific populations difficult. Certainly KPM’s 
sampling seems never worse and often better than that of the other
studies.







46. Analysis 


Most studies confined their analysis to simple descriptive statistics— 
percentages, means, and medians. A few added ranges, standard devia- 
tions, correlation coefficients, and attempted significance tests. About 
half used two-way breakdowns, usually on background characteristics, 
as a way of sharpening differences between groups. Three studies 
offered scales either based on judges’ evaluations (Landis, and Landis 
and Bolles), or scoring of batteries of items (Terman). KPM restricted 
the use of scales to occupational classification and homosexual-heterosexual
rating. They added the accumulative incidence curve, the U. S.
corrections, and extensively used fine-grained (high-order) breakdowns. 
In general, KPM’s analysis employed more devices and was more 
searching than the analyses offered by other studies. 


47. Interpretation 


We have already mentioned (33) that KPM are competent at the 
accurate and understandable verbal description of the meanings of a 
table whose entries are taken as correct. Some of the other authors 
have also done well, although the extent of their analysis is usually 
more limited. In inferring from sampled population to target popula- 
tion, all the studies are weak. The inferences left with the reader (if 
we are to judge) are much broader than the studies could possibly 
warrant. Every study has its own precautionary remarks to the effect 
that the reader must not extend the inferences beyond that of the 
population studied. Very little attempt is made to describe the target 
population, to help the reader with the step from sample to sampled 
population, or to remind him of sampling fluctuations. The precaution- 
ary remarks in the opening pages of a study are usually forgotten when 
the authors come to discuss matters of national policy, morals, legisla- 
tion, therapy, and psychological and sociological implications toward 
the end of their book. The reader must then be left with the inference 
that the findings apply on at least a national scale. Bromley and Britten 
are more forthright than most. They argue overtly that their volunteer 
college sample is representative of all U.S. individuals of college age.
Of the 10 studies considered, only two, Davis and Farris, seem to have 
consistently exercised due caution about generalization from sample 
to population and warnings to the reader. The last paragraph of the 
section entitled, “Description of Sample and Sampling Methods” 
in each review in Appendix B gives one reader’s opinion of the general- 
izations from sample to sampled population intended by the author. 

Our reviewer was not asked to gather data that would give us a way 
of comparing the extent of unsupported statements in the other stud- 







ies with those of KPM, so this aspect of interpretation remains uncom- 
pared by us. It would be very interesting if someone would collect 
such information, not only in connection with the present work, but 
with regard to general scientific writing in various fields. This would be 
no small task. 


CHAPTER XI. CONCLUSIONS 
48. Interviewing 


(1) The interviewing methods used by KPM may not be ideal, but 
no substitute has been suggested with evidence that it is an improve- 
ment. 

(2) The interviewing technique has been subjected to many criti- 
cisms (see Section A-11), but on examination the criticisms usually 
amount to saying “answer is unknown,” or “KPM have not demon- 
strated how good their method is.” 

These conclusions can be summarized by saying that we need to know 
more about interviewing in general. 


49. Checks 


(1) The types of methodological checks considered by KPM seem 
to be quite inclusive. 

(2) A greater volume of checks (more retakes, etc.) is desirable, as is 
more delicate analysis. (See Sections C-15 and C-18.) 

(3) The results of duplicate recording of interviews should be pub- 
lished. 

These conclusions can be summarized by saying that KPM’s checks 
were good, but they can afford to supply more. 


50. Sampling 


Given U. S. white males as the target population, our conclusions 
are that: 

(1) KPM’s starting with a nonprobability sample was justified. 

(2) It should perhaps already have been supplemented by at least 
a small probability sample. 

(3) If further general interviewing is contemplated, and perhaps even 
otherwise, a small probability sample should be planned and taken. 

(4) In the absence of a probability-sample benchmark, the present 
results must be regarded as subject to systematic errors of unknown 
magnitude due to selective sampling (via volunteering and the like). 


51. Analysis 


KPM’s analysis is best described as simple and relatively searching. 
They did not use such techniques as analysis of variance or multiple 
regression, but they brought out the indications of their data in a 
workmanlike manner. 

In more detail: 

(1) their selection of variables for adjustment seemed to be a reason- 
ably effective substitute for more complex analyses, 

(2) they gave several measures of activity (giving the reader a choice 
at the expense of more tables to examine), 

(3) they made essentially no use of tests of significance, but cited 
many standard errors (which were inappropriate for their cluster sam- 
ples), 

(4) they used U. S. Corrections and their (independently developed) 
accumulative incidence curve. More careful exposition of these devices 
would have been desirable. 

To summarize in another way: 


(i) they did not shirk hard work, and 
(ii) their summaries were shrewd descriptive comments rather than 
inferential statements about clearly defined populations. 


Their main attempt at inferences was a sample size experiment whose 
results (i) could have been predicted by statistical theory, (ii) were 
irrelevant to their cluster sampling. 

They continued to add new interviews without redoing earlier tabu- 
lations, thus producing an unwarranted effect of sloppiness in the book, 
although their records were kept carefully and in unusually good 
shape. 


52. Interpretation 


(1) KPM showed competence in accurate and understandable ver- 
bal description of the trends and tendencies indicated by their tables. 
In stating and summarizing what the sample seems to show, they 
were competent and effective. 

(2) Their discussion of the uncertainties in the inferences from the 
numbers in the tables to the behavior of all U. S. white males was brief, 
insufficiently repeated, and oftentimes entirely lacking. In instilling 
due caution about sampling fluctuations and differences between 
sampled and target populations, they were lax and ineffective. 

(3) Their discussion of systematic errors of reporting is careful and 
detailed (with the exception of some questions bearing on generation 
comparisons). 

(4) Many of their most interesting statements are not based on the 
tables or any specified evidence, but are nevertheless presented as 
well-established conclusions. Statements based on data presented, 
including the most important findings, are made much too boldly and 
confidently. In numerous instances their words go substantially beyond 
the data presented and thereby fall below our standard for good 
scientific writing. 

53. Comparison with other studies 


In comparison with nine other leading sex studies, KPM’s work is 
outstandingly good. 

In more detail, 

(1) their interviewing ranks with the best, 

(2) they have more and better checks, 

(3) their geographic and social class representation is broader and 
better, 

(4) their volunteer non-probability sample problem is the same, 

(5) they used more varied and searching methods of analysis, 

(6) only two of the nine studies (Davis and Farris) were more care- 
ful about generalization and warned the reader more thoroughly 
about its dangers. 


Thus, KPM’s superiority is marked. 


54. The major controversial findings 


It is perhaps fair to regard these four as KPM’s major controversial 
findings: 

(1) a high general level of activity, including a high incidence of 
homosexuality, 

(2) a small change from older to younger generations, 

(3) a strong relation between activity and socio-economic class, 

(4) relations between activity and changes of socio-economic class. 

All of these KPM set forth as well established conclusions. All are 
subject to unknown allowances for: 

(a) difference between reported and actual behavior, 

(b) nonprobability sampling involving volunteering. 

While their findings may be substantially correct, it is hard to set 
any bounds within which the truth is statistically assured to lie (see 
Appendix A, Section 4). Once again, we wish to point out that the same 
difficulties are present in many sociological investigations. 


CHAPTER XII. SUGGESTED EXTENSIONS 
55. Probability sampling 
Appendix D discusses the advantages, possibilities and difficulties 
of probability sampling in some detail. 
In brief summary: 













(1) Costs and refusal rates together determine the wisdom of exten- 
sive probability sampling. 

(2) Information on costs and refusal rates is lacking. 

(3) Hence probability sampling should begin on a very small scale, 
say 20 cases. 

(4) A step-by-step program, starting at such a scale, seems wise, and 
is recommended to KPM. 


56. Retakes 


While retakes showed high agreement on vital statistics, and moder- 
ately high agreement on incidence, the data presented in KPM for 
frequencies show considerably less agreement. The data do not make 
clear how much better a retake agrees with a take than with a randomly 
selected interview for another subject with the same age, religion, 
social class, etc. 

If the agreement is better, then retakes will provide evidence as to 
non-random agreement—evidence bearing on the much-discussed sub- 
ject of the constancy of recall. In addition, take-retake differences are 
clearly so large as to make retakes of two old subjects at least as valu- 
able as a take of one new subject in determining the average behavior 
of groups (see Section A-24). 

If the agreement is no better, then retakes will provide evidence 
that this was so, and every retake will be as valuable as a new take in 
determining the average behavior of groups. 

In our opinion 500 retakes would help the standing of KPM’s data 
more than 2000 new interviews (selected in the same old way). It would 
of course be important to determine and report the selective factors 
which influenced the selection of the retaken subjects. 


57. Spouses 


Separate interviews of husband and wife are a useful supplement to 
retakes, in that they supply the nearest approach to two independent 
reports of the same action, although the information is restricted for 
the most part to marital coitus, and is weakened by the possibility 
of collusion. In the book, KPM present comparisons for 231 pairs of 
spouses. 

In an expansion of this program, various elaborations could be sug- 
gested. The first objective should probably be to interview more pairs 
from the lower educational levels, in order that the agreement between 
spouses can be examined separately for different educational levels. 
As in the case of retakes, the data are not wasted so far as the main 
study is concerned, since they contribute both to the male and female 
samples. 










58. Presentation 


As the critics point out (Chapters VII-A, I-C), parts of the book 
are hard to understand because of lack of clarity of presentation. In 
future editions, the following steps would remove the major 
ambiguities. 

(i) KPM should explain why the numbers of cases change erratically 
from table to table. In future publication it would be worth substantial 
effort to avoid these changes. 

(ii) Table headings and contents should be critically reviewed as to 
their lucidity. 

(iii) Worked examples of the calculation of U. S. corrections should 
be given. References under the tables to the variables used for 
correction should be more precise. 

(iv) More discussion should be given, with numerical illustration, 
of the meaning of accumulative incidence percentages. 

(v) More information should be given about the questions asked, 
with their variations, in the interview. Although this would be 
extremely laborious to do for the complete interview, one or two blocks 
of related questions might serve the purpose. For such a block, KPM 
might describe (a) the variations used in the statement of the 
questions, (b) the variations in the order of questions, (c) the reasons 
for the variations. An illustration of this type would give deeper 
insight into the logical structure of KPM’s interviewing technique and 
might go far to substantiate their claim (p. 52) that flexibility is one 
of the strengths of their technique. 

(vi) Several critics make a strong plea that more information be 
given about the composition of the sample (see Chapters I-A, I-C). 
The specific items requested vary with the critic, and some would be 
a major undertaking both in preparation and publication. A minimum 
that seems feasible would be to present a multiple classification of the 
subjects according to the following items at the time of interview: age, 
marital status, occupation, educational status, religious affiliation, 
place of residence. In addition, more information is needed about the 
extent to which special groups (e.g., those in penal institutions, 
homosexual groups) contribute to the tables. 


59. Statistical analyses 


In Appendix C, a number of statistical analyses are outlined which 
would be a useful contribution to the methodology of studies of this 
kind. The analyses would require expert statistical direction. 













As has been pointed out, the standard errors presented by KPM 
are invalid, because they were computed on the assumption of random 
sampling of individuals. A method for calculating standard errors so 
as to take into account the actual nature of KPM’s sampling is given 
in Chapter II-C. These standard errors would allow a realistic appraisal 
of the stability of KPM’s means. They would indicate by how much 
the means determined from the present KPM sample are likely to vary 
from the means of a much larger sample of cases obtained by the KPM 
methods. 

KPM described orgasm rates in terms of per cent incidence and mean 
or median frequency. However, other mathematical functions of these 
variables may be more appropriate, leading to simpler statements of 
the results. Approaches for investigating this question, and the related 
question of the use of some combination of the variables, are suggested 
in Chapters III-C and IV-C. 

The question of applying adjustments to segment means has already 
been discussed (Section 17). A technique is presented (Chapter V-C) 
for reaching practical decisions on the appropriateness of adjustment 
and on the number of variables for which adjustment should be made. 


60. Relative priorities 


We give here our personal collective opinion as to how further effort 
on the male study might best be spent (we have not tried to evaluate 
priorities in comparison with the female study, or any other studies 
which KPM may contemplate). 

If the interviewer time which it would require were available, we 
believe that the effort required for the proposed probability sample 
would be worthwhile. 

So long as it did not interfere with the possibility of a probability 
sample, available interviewer time should be concentrated: 


on retakes when working in or near old areas. 
on husband-wife pairs when two interviewers are available. 


If the probability sample has already been ruled out, and if fewer 
interviewer months are available, then an attempt to retake a random 
sample of previous subjects would be most desirable, whenever possible, 
husband and wife being taken whenever either is retaken. 

Effort in the form of statistical analysis and presentation need not 
interfere with interviewing, and should be pressed to the extent that 
experienced and understanding personnel can be found. 





THE INVENTORY PROBLEM* 


J. LADERMAN, Office of Naval Research 
S. B. LITTAUER, Columbia University 
LIONEL WEISS, University of Virginia and Cornell University 


This article is expository, and is based on the two papers, “The 
Inventory Problem,” by A. Dvoretzky, J. Kiefer, and J. Wolfowitz, 
which appeared in the April 1952 and July 1952 issues of Econometrica. 
These papers are too advanced mathematically to be read by many 
of those to whom the results might be of interest. It is hoped that this 
paper will bring the new technique to the attention of those persons, 
both in government and private industry, who are responsible for mak- 
ing decisions affecting the amount of inventory to be held by their 
organizations. In the opinion of the present authors, great economies 
can result from the application of this new inventory control technique. 
The inventory problem can be stated very simply: it is to decide 
how much material to stock in preparation for an uncertain future. 
Both understocking and overstocking are costly, else there is no prob- 
lem. If overstocking is not penalized, such large stocks could be held 
that no conceivable future occurrence would deplete them; if under- 
stocking is not penalized, zero stocks could be held. The usual cases, 
where both understocking and overstocking are costly, are the ones of 
interest here. For example, the proprietor of a restaurant, buying 
perishables for the day, will see them spoil if he buys too many, or will 
turn customers away unsatisfied if he buys too few, thus failing to earn 
potential profits and perhaps permanently losing some customers. Even 
if a merchant does not deal in perishables, overstocking may involve 
carrying costs which include such items as rent, insurance, deprecia- 
tion, loss of interest on capital invested, etc. As a less homely example, 
an army would certainly be heavily penalized for being caught short of 
ammunition, but since there are other important items needed by an 
army, it would be possible to stock too much ammunition at the 
sacrifice of other military items. 
The reader can no doubt think of other cases, closer to home, where 
a balance must be struck between overstocking and understocking. 
The purpose of this article is to describe a method of striking this 
balance so as to minimize the losses to be expected from taking the risks 
of overstocking or understocking, which are unavoidable when one 
has to provide for an uncertain future demand. 

* This work was sponsored by the Office of Naval Research. 

As a first step, we describe a simple but important concept, the 
“schedule of losses.” The schedule of losses is a schedule which shows 
what the loss is for any given combination of stock held and future de- 
mand. A profit is regarded as a negative loss. The schedule of losses can 
often be simply expressed by a mathematical formula. For example, 
suppose a newspaper vendor buys y papers from the publisher for 3 
cents a copy, sells d copies for 5 cents a copy, and resells the unsold 
copies to the publisher for 1 cent a copy. Then his loss in cents is 
-2d + 2(y - d), because he makes 2 cents profit on each of the d papers 
he sells and loses 2 cents on each of the (y - d) papers he returns, 
where of course the number sold to customers, d, cannot exceed the 
number bought from the publisher, y. Thus for any possible combina- 
tion of numbers of papers bought and sold, we can compute the loss 
incurred by the vendor. The schedule of losses in this case is typical 
of many situations where unsold stock depreciates in value (perhaps 
even becomes worthless). 
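
The vendor's schedule of losses can also be written as a small function. The following is only a minimal sketch in Python; the function name `vendor_loss` and the parameter names `cost`, `price`, and `salvage` are illustrative choices of ours, not notation from the article, and the default prices are those of the example above.

```python
def vendor_loss(y, d, cost=3, price=5, salvage=1):
    """Loss in cents when y papers are bought and d of them are sold.

    A profit counts as a negative loss. Parameter names are illustrative
    assumptions; the default prices are those of the newspaper example.
    """
    if not 0 <= d <= y:
        raise ValueError("papers sold must satisfy 0 <= d <= y")
    profit_on_sales = (price - cost) * d          # 2 cents per paper sold
    loss_on_returns = (cost - salvage) * (y - d)  # 2 cents per paper returned
    return -profit_on_sales + loss_on_returns


print(vendor_loss(250, 200))  # -2*200 + 2*50 = -300
```

With the default prices this reduces to 2y - 4d, so buying 250 papers and selling 200 of them gives a loss of -300 cents, that is, a profit of 3 dollars.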

The schedule of losses includes losses arising from the four following 
categories: 


a) Negative of the profit from a transaction or other gain from the completion 
of a mission. 


b) Carrying costs which are the losses arising from the stocking of the com- 
modity. 

c) Losses due to depletion which arise when the demand exceeds the available 
supply. 

d) Ordering costs which are the costs involved in processing an order to 
change the inventory level. 


In the above newspaper example the schedule of losses involved only 
items a and b. The -2d represented the negative of the profit from the 
sale of the d papers and the 2(y - d) represented the loss due to 
obsolescence of the papers (a carrying cost). 

It is almost inconceivable that a person responsible for making deci- 
sions would not have a fairly good idea of what the schedule of losses 
is for his case. In the absence of any knowledge at all about the sched- 
ule of losses, it is difficult to imagine on what rational grounds the size 
of inventory can be set. From now on, we shall assume that the sched- 
ule of losses is known, at least approximately, and shall then describe 
a method of choosing the size of inventory to be held. 

First we discuss a particularly simple case where stock can only be 
ordered or returned to the supplier (a negative order) at the beginning 
of a time interval, and only the stock held after the order is placed can 
be used to meet the demand that will arise during the interval. We as- 
sume that there is instantaneous delivery of the order, and also that the 
future demand is completely known. It is this last assumption that 
makes this case so simple, for once the schedule of losses is known and 
the future demand is known, we simply place an order of a size that will 
minimize the loss. Thus, in the case of the newspaper vendor above, it 
is clear that the number of copies he should buy from the publisher is 
the number of copies he will be able to sell (i.e. the total demand). For 
he loses 2 cents on each unsold paper, but makes 2 cents on each paper 
he sells. And in any other case where the future demand is known, it 
is a simple matter to choose an order that will minimize the loss. 

It is when the future demand becomes uncertain that we meet diffi- 
cult and more realistic cases. First we discuss how we shall interpret 
“uncertain future demand.” We will not interpret this as meaning com- 
plete lack of knowledge about future demand, nor, obviously, do we 
mean that we know exactly what the future demand is going to be. 
“Uncertain future demand” to us shall mean something between 
complete lack of knowledge and complete certainty; namely, that future 
demand is a chance variable with a known probability distribution. In 
other words, future demand may have any one of several values, with 
known probabilities. The reader will perhaps inquire under what cir- 
cumstances we would be justified in regarding demand as a chance 
quantity. We will not try to give a complete answer here, but will note 
that demand may depend on various factors of a chance nature, there- 
by making demand itself a chance quantity. For example, demand may 
depend upon the weather, which itself is frequently considered as 
though it depends on chance. 

To make these ideas more specific, let us take the case of the news- 
paper vendor discussed above. Suppose he is located in a suburban rail- 
road station, and that each morning there are 200 customers who reach 
the station early enough to buy a paper from him. Another 50 potential 
customers arrive at the station in a bus. If the bus arrives early, each 
of the 50 buys a paper, but if the bus arrives late, none of the 50 has 
time to buy a paper. Let us assume that the bus arrives late half of the 
time, and there is no way of telling beforehand on any day whether or 
not the bus will be late. Then it is clear that the demand for the 
vendor’s papers on any given day is a chance variable which can take the 
value 200 with probability 1/2, or 250 with probability 1/2. This means 
that, in the long run, on 1/2 of the days the demand will be for 200 
papers, on the other 1/2 of the days, for 250 papers. How many papers 
should the vendor buy from the publisher in this case? 

The vendor’s loss will be a chance variable because it will depend 
upon the demand, itself a chance variable. The probability distribution 
of the loss will depend upon the number of papers the vendor buys 
from the publisher. (We remind the reader that the distribution of a 
chance variable is simply a list of the possible values of the chance vari- 
able with their respective probabilities.) Roughly speaking, it is clear 
that the size of the vendor’s purchase from the publisher should be such 
as to make the probabilities of large losses small. This rather vague 
statement, however, is not explicit enough to enable us to decide just 
what the size of the vendor’s purchase should be. Many different inter- 
pretations can be made, but throughout the remainder of this paper we 
are going to choose the size of the order to make the expected value of 
the loss as small as possible. A justification for using the procedure 
which minimizes the expected value of the loss is that such a procedure 
is the best one to use if one wishes to minimize the average loss in the 
long run. In the next paragraph we shall review briefly the concept of 
expected value. 

If one takes many observations on a chance variable, the average of 
the observations will ordinarily tend to some number. That number is 
called the expected value of the chance variable. More precisely, for 
our purposes the expected value of a chance variable may be defined 
as the weighted average of all the values the chance variable can take 
on, with the probability of each value as its weight. For example, 
suppose a chance variable can take on the values 1, 2, or 3 with 
probabilities of 1/3, 1/3, 1/3 respectively. Then the expected value is 
given by (1/3)(1) + (1/3)(2) + (1/3)(3) = 2. Clearly, if many observations 
are made on this chance variable, about 1/3 of them will have the value 1, 
about 1/3 of them will have the value 2, and the remaining ones will have 
the value 3, so the average of all the observations will usually be close 
to 2. 
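
The weighted-average definition can be checked with a few lines of Python. This is a sketch under our own naming (`expected_value` is a hypothetical helper, not something from the article); exact fractions are used so that the three equal probabilities sum to exactly 1.

```python
from fractions import Fraction


def expected_value(distribution):
    """Weighted average of the values, each weighted by its probability.

    `distribution` maps each possible value to its probability.
    """
    if sum(distribution.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(value * prob for value, prob in distribution.items())


third = Fraction(1, 3)
print(expected_value({1: third, 2: third, 3: third}))  # 2
```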

It is clear from the discussion of the preceding paragraph that choos- 
ing the size of the order to minimize the expected value of the loss is 
not an unreasonable procedure, since the smaller the probabilities of the 
larger losses, the smaller the expected value of the loss. Also, if such a 
policy is applied over and over, the average loss will usually be less than 
that obtained from any other policy. 

Before we actually compute the size of the order that will minimize 
the newspaper vendor’s expected loss, we shall discuss a possible 
objection to our whole procedure. The practical definition of probability 
is in terms of the “long run.” That is, when we say that the probability 
of an event is a given fraction, we mean that in a long series of 
experiments or trials, the event will occur about that fraction of the 
time. Therefore, it might be asked, of 
what use is probability theory, whose statements in practice refer to the 
long run, in a problem like that of the newspaper vendor, who is con- 
cerned with his loss in one particular day? In many cases, no answer to 
this objection is necessary, since we will be dealing with a long series of 
trials. Even our newspaper vendor will presumably be trying to sell 
his papers day after day under the same circumstances, so that min- 
imizing his expected loss for one day is equivalent to minimizing his 
average loss per day over all the days he will be selling papers. In those 
cases where there will not be a long series of trials, an answer to the 
objection might be that even though, in practice, the probability of an 
event in one trial is the proportion of times it would occur in a long 
series of identical trials, even if only one trial were made, the higher 
the probability of the event, the greater would be our confidence that 
the event would occur in that one trial. For example, if one were told 
that he will be executed if he draws a red card from a deck, he would 
certainly prefer to draw the card from a deck of 51 black cards and 1 
red card rather than from a regular deck of cards. 

Returning now to the newspaper vendor, we want to find how many 
newspapers he should buy from the publisher in order to minimize his 
expected loss, where his schedule of losses (from above) is 
-2d + 2(y - d) = 2y - 4d, and the number of customers who will seek to 
buy a paper is a chance variable with possible values of 200 or 250, each 
with probability 1/2. If the vendor buys 200 or fewer papers from the 
publisher, all the copies will be sold to customers, none resold to the 
publisher, so the loss will be minus twice the number bought from the 
publisher, namely -2y. From this it is apparent that no fewer than 
200 copies should be bought from the publisher, for the loss decreases 
as the number bought increases from zero to 200. Also, it is clear that no 
more than 250 papers should be bought, for the total in excess of 250 
will surely have to be resold to the publisher at a loss of 2 cents each. 
So the proper number to order to minimize the expected loss is between 
200 and 250 inclusive. Then the loss will be either 2y - 4(200) with 
probability 1/2 (that is when the bus is late), or else 2y - 4y = -2y with 
probability 1/2 (this is when the bus is not late, making d = y). Therefore 
the expected loss when the number bought from the publisher is between 
200 and 250 is equal to (1/2)(2y - 800) + (1/2)(-2y), which equals minus 
400 cents. Thus it turns out that the expected loss is the same for any 
order between 200 and 250 and it is greater for any other order. Just 
to be specific, let us agree that whenever more than one order will 
achieve the minimum expected loss, we will choose the smallest of the 
orders. Thus in this case we would order 200 papers. 
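
The argument above can be verified by brute force. The sketch below, with names of our own choosing, evaluates the expected loss for every order from 0 to 300 papers (the upper limit of 300 is an arbitrary illustrative bound) and lists the orders that achieve the minimum.

```python
from fractions import Fraction

# Demand distribution from the example: 200 or 250, each with probability 1/2.
DEMAND = {200: Fraction(1, 2), 250: Fraction(1, 2)}


def loss(y, demand):
    """Loss in cents, 2y - 4d, where d = min(y, demand) papers are sold."""
    d = min(y, demand)
    return 2 * y - 4 * d


def expected_loss(y):
    """Expected loss in cents when y papers are bought."""
    return sum(prob * loss(y, demand) for demand, prob in DEMAND.items())


candidates = range(0, 301)  # arbitrary illustrative search range
best = min(expected_loss(y) for y in candidates)
minimizers = [y for y in candidates if expected_loss(y) == best]

print(best)                           # -400
print(minimizers[0], minimizers[-1])  # 200 250
```

Every order from 200 to 250 gives the same expected loss of -400 cents, and the convention of taking the smallest such order selects 200.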

We shall now give another example to illustrate how this method can 
be applied to an actual inventory problem of the Navy. There are cer- 
tain rather expensive items (some costing over $100,000 each) known 
as “insurance spares” which are generally procured at the time a new 
class of ships is under construction. These spares are bought even 
though it is known that it is very unlikely that any of them will ever 
be needed and that they cannot be used on any ship except those of 
that particular class. They are procured in order to provide insurance 
against the rather serious loss which would be suffered if one of these 
spares were not available when needed. Also, the initial procurement 
of these spares is intended to be the only procurement during the life- 
time of the ships of that class because it is extremely difficult and costly 
to procure these spares at a later date. The present policy is to order 
quantities of these spares according to the following schedule: 


Total number of        Number of spares 
items installed            ordered 
      1-4                     1 
      5-50                    3 
     51-100                   3 
    over 100                  4 


This particular ordering policy is based on the judgment of personnel 
familiar with the expected usage rate of such technical spares and also 
experienced with procurement policies of the Navy. However, they 
will admit that the construction of such a table is largely an intelligent 
guess and that the quantities shown to be ordered cannot be justified 
objectively. On the other hand, the procedure to be given below, based 
on the previous discussion, will lead to an objective method of 
constructing such a table. Moreover, by using this procedure the total loss 
over a long period of time will ordinarily be less than that obtained from 
any other ordering policy. 

Let us suppose N ships are being constructed of a certain class 
containing an item of the type described above, for which spares cost P 
dollars each. Let $p_i$ represent the probability that exactly i spares 
will be needed as replacements during the lifetime of the N ships; that 
is, $p_1$ is the probability that exactly one spare will be needed, $p_2$ 
is the probability that exactly two spares will be needed, etc. Let us 
also assume that the probability of 5 or more spares being needed is zero. 
Let L dollars be the loss (usually quite large) suffered for each spare 
that is needed when there is none available in stock. In obtaining the 
schedule of losses we shall neglect all the smaller losses and include 
only the cost of the spares (which become worthless if never used) and the 
depletion loss occurring when spares are needed but not available. Then 
for any d ≤ y, there is no loss from depletion, so the schedule of losses 
would simply be the number bought multiplied by the unit price, which is 
yP. For d ≥ y, the schedule of losses would be yP + (d - y)L because 
(d - y) is the number of spares needed but not available, and L is the 
loss incurred for each one short. Hence, the schedule of losses, which is 
denoted by W(y, d), is 

$$
W(y, d) =
\begin{cases}
yP & \text{for } d \le y, \\
yP + (d - y)L & \text{for } d \ge y.
\end{cases}
$$


Now let us get the expected value of the loss, EW(y, d), for each value 
of y from y = 0 to y = 4. These are the only values of y which need be 
considered because if y is greater than 4, the loss will surely be greater 
than the loss for y = 4. For y = 0 we have that d ≥ y for all possible 
values of d, hence 

$$
\begin{aligned}
EW(y, d) &= 0 \cdot P + [\text{Prob. } d = 0](0 - 0)L + [\text{Prob. } d = 1](1 - 0)L \\
&\quad + [\text{Prob. } d = 2](2 - 0)L + [\text{Prob. } d = 3](3 - 0)L \\
&\quad + [\text{Prob. } d = 4](4 - 0)L \\
&= (p_1 + 2p_2 + 3p_3 + 4p_4)L \qquad \text{for } y = 0.
\end{aligned}
$$

In a similar manner, we get 

$$
\begin{aligned}
EW(y, d) &= P + (p_2 + 2p_3 + 3p_4)L && \text{for } y = 1 \\
&= 2P + (p_3 + 2p_4)L && \text{for } y = 2 \\
&= 3P + p_4 L && \text{for } y = 3 \\
&= 4P && \text{for } y = 4.
\end{aligned}
$$

Now for any given values of P, L, and $p_i$, it is a simple matter to 
tabulate the values of EW(y, d) for the different values of y in order to 
determine which value of y gives the smallest expected loss. For example, 
suppose $P = \$100{,}000$, $L = \$10{,}000{,}000$, $p_1 = .04$, $p_2 = .01$, 
$p_3 = .001$, $p_4 = .0002$, and the probability of 5 or more spares being 
needed is zero (hence $p_0 = .9488$), then 

$$
\begin{aligned}
\text{for } y = 0,\quad EW(y, d) &= (.04 + .02 + .003 + .0008)(10{,}000{,}000) = \$638{,}000 \\
\text{for } y = 1,\quad EW(y, d) &= 100{,}000 + (.01 + .002 + .0006)(10{,}000{,}000) = \$226{,}000 \\
\text{for } y = 2,\quad EW(y, d) &= 200{,}000 + (.001 + .0004)(10{,}000{,}000) = \$214{,}000 \\
\text{for } y = 3,\quad EW(y, d) &= 300{,}000 + (.0002)(10{,}000{,}000) = \$302{,}000 \\
\text{for } y = 4,\quad EW(y, d) &= \$400{,}000.
\end{aligned}
$$


Since under the above assumed conditions the expected loss is smallest 
when y = 2, the best ordering policy is to order 2 spares. 
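
The tabulation for the insurance-spares example is easy to reproduce. The following Python sketch uses the numerical values assumed in the example; the names `W`, `expected_W`, and `best_y` are ours, and exact fractions are used to avoid rounding in the probabilities.

```python
from fractions import Fraction

P = 100_000      # price of one spare, in dollars
L = 10_000_000   # loss for each spare needed but not available, in dollars

# Probability that exactly d spares are needed, for d = 0, ..., 4.
p = {
    0: Fraction("0.9488"),
    1: Fraction("0.04"),
    2: Fraction("0.01"),
    3: Fraction("0.001"),
    4: Fraction("0.0002"),
}


def W(y, d):
    """Schedule of losses: cost of y spares plus the depletion loss."""
    return y * P + max(d - y, 0) * L


def expected_W(y):
    """Expected loss when y spares are ordered."""
    return sum(prob * W(y, d) for d, prob in p.items())


for y in range(5):
    print(y, expected_W(y))

# min() returns the first minimizer, i.e. the smallest y in case of ties.
best_y = min(range(5), key=expected_W)
print("best order:", best_y)  # best order: 2
```

The printed expected losses are 638000, 226000, 214000, 302000, and 400000 dollars for y = 0 through 4, in agreement with the figures above.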

The newspaper vendor’s problem and the Navy inventory problem 
that we have discussed, simple as they are, contain the most important 
elements of all the other problems we shall discuss. We now give a more 
general formulation of essentially the same problem. 

Suppose that at the beginning of a certain time period we have a 
stock, x, of a certain commodity, which we shall call stock before or-
dering. During the period a certain demand, d, for the commodity will
be observed. This demand is a chance variable whose probability dis- 
tribution is known to us. The probability distribution of demand may 
depend upon the stock before ordering and the size of our order, but 
once these two quantities are known, the distribution of demand is 
known. To prepare for this demand, we have the privilege of ordering 
more of the commodity from the producer, ordering no more, or re- 
turning some, but this must be done at the beginning of the period. No 
orders can be delivered or stock returned once the period has started. 
We shall assume that there is no time lag in delivery from or to the 
supplier. Our problem is to find the quantity we should order, which 
will be denoted by y - x. Thus y is the quantity on hand at the start
of the time period but after ordering. A return of goods to the producer 
is a negative order. In all practical cases there will be certain limits on 
the size of orders that can be placed, and our solution of the problem 
will take this into account. Our schedule of losses tells us what our loss 
is for each possible combination of values of the stock before ordering, 
order, and demand. In the newspaper vendor’s problem and in the 
Navy inventory problem the stock before ordering was zero which is 
why the x did not appear in the schedule of losses.

In general we will choose that size of order such that the expected 
loss is minimized. The size of order that minimizes the expected loss 
will depend upon the size of stock before ordering. An “ordering policy” 
is a schedule showing what size of order to use for any given size of 
stock before ordering. The ordering policy is the complete solution to 
our problem, for it tells us just what to do in any given circumstances. 
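
The definition just given can be stated compactly. The notation in this display is ours rather than the paper's: W(x, y, d) stands for the schedule of losses when the stock before ordering is x, the stock after ordering is y, and the demand is d; p(d | x, y) is the probability of demand d given x and y; and Y(x) is the set of values of y permitted by the limits on the size of orders. Assuming a discrete demand, the ordering policy assigns to each x the value

\[
y^*(x) = \arg\min_{y \in Y(x)} \sum_{d} p(d \mid x, y)\, W(x, y, d),
\]

the corresponding order being y^*(x) - x. For a continuous demand the sum would be replaced by an integral.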

A simple example will illustrate the ideas we have been discussing.
Suppose the proprietor of a newsstand has x copies of a monthly maga-
zine on hand at the end of the 10th of the month, for which he has al-
ready paid. The wholesale magazine dealer is coming the next morning
and will take back as many magazines as the vendor desires to return,
paying the vendor 4 cents each, or he will sell the vendor any number
of additional copies at 6 cents each. The vendor charges his customers
15 cents per copy. The wholesaler will not return until the end of the
month, at which time he will buy back all unsold magazines at 2 cents
per copy. We assume that the number of people who will attempt to
buy a copy from him during the remainder of the month will be either
4 or 5 with probabilities of ¾ and ¼ respectively. Also, if the vendor
has any unsold copies at the end of the month, he will definitely sell
them back to the wholesaler. What should the ordering policy be in
this case? The schedule of losses, which shall be denoted by W(x, y, d)
because it depends on the value of x, on the ordering quantity, y - x,
and on the demand, d, is obtained in the following way:

When d ≥ y, y copies will be sold, which will bring the vendor 15y
cents. In addition, if y ≤ x, then (x - y) copies are returned to the
wholesaler, which brings the vendor 4(x - y) cents, making 15y + 4(x - y)
his total gain, and if y ≥ x, the vendor purchases an additional (y - x)
copies, making his total gain 15y - 6(y - x). The negatives of these gains
are the losses, yielding

\[
W(x, y, d) =
\begin{cases}
-[15y + 4(x - y)] = -11y - 4x & \text{for } d \ge y,\ y \le x, \\
-[15y - 6(y - x)] = -9y - 6x & \text{for } d \ge y,\ y \ge x.
\end{cases}
\]

When d ≤ y, we have a situation similar to the above except that d cop-
ies are sold by the vendor instead of y copies and (y - d) copies are re-
turned to the wholesaler at the end of the month at 2 cents each, yield-
ing

\[
W(x, y, d) =
\begin{cases}
-[15d + 4(x - y) + 2(y - d)] = -13d - 4x + 2y & \text{for } d \le y,\ y \le x, \\
-[15d - 6(y - x) + 2(y - d)] = -13d - 6x + 4y & \text{for } d \le y,\ y \ge x.
\end{cases}
\]
Now we need the expressions for the expected value of the loss,
EW(x, y, d), from which for any given x, we will be able to find the y
(hence the order quantity, y - x) which will minimize the expected
loss. For y ≤ 4, the demand is certainly equal to or greater than y,
and since in this case W(x, y, d) does not depend on the value of d, we
have

\[
EW(x, y, d) =
\begin{cases}
-11y - 4x & \text{for } y \le 4,\ y \le x, \\
-9y - 6x & \text{for } y \le 4,\ y \ge x.
\end{cases}
\]

Clearly the expected value of the loss is minimized in this case when y
is made as large as possible, which is 4. On the other hand, for y ≥ 5, the
demand is equal to or less than y, and we have

\[
\begin{aligned}
EW(x, y, d) &= -13[\tfrac{3}{4}(4) + \tfrac{1}{4}(5)] - 4x + 2y \\
&= -55\tfrac{1}{4} - 4x + 2y && \text{for } y \ge 5,\ y \le x, \\
EW(x, y, d) &= -13[\tfrac{3}{4}(4) + \tfrac{1}{4}(5)] - 6x + 4y \\
&= -55\tfrac{1}{4} - 6x + 4y && \text{for } y \ge 5,\ y \ge x.
\end{aligned}
\]


Here the expected value of the loss is minimized by making y as small 
as possible, which is 5. Hence the ordering policy is certain to call for 
y=4 or y=5, which was fairly obvious anyway from the fact that the 
demand could be only 4 or 5. However, the above expressions for 
EW(x, y, d) are needed to determine when y=4 and when y=5. Sup-
pose x ≤ 4; then the expected loss for y=4 is -36 - 6x and for y=5 it is
-35¼ - 6x, which is greater than -36 - 6x; hence the best policy when
x ≤ 4 is to order up to 4. Now suppose x ≥ 5; then the expected loss for
y=4 is -44 - 4x and for y=5 it is -45¼ - 4x, which is less than -44
- 4x; hence the best policy now is to return all those over 5. To sum-
marize the above, the best policy for the vendor is to buy up to 4 if he
has less than 4 on hand, or to do nothing if he has 4 or 5, or to return 
any excess over 5 on hand. 
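
The policy just summarized can be checked by direct enumeration. The sketch below is our own illustration; the names loss, expected_loss and best_stock, and the cap max_stock on the values of y tried, are assumptions introduced for the purpose. It takes the demand to be 4 or 5 with probabilities ¾ and ¼, builds the loss from the prices stated in the example, and uses exact fractions so that no rounding enters.

```python
from fractions import Fraction

# Demand for the rest of the month: 4 with probability 3/4, 5 with probability 1/4.
DEMAND = {4: Fraction(3, 4), 5: Fraction(1, 4)}


def loss(x, y, d):
    """Loss in cents (negative of gain) for stock x before ordering,
    stock y after ordering, and demand d."""
    sold = min(y, d)
    gain = 15 * sold + 2 * (y - sold)   # sales, plus buy-back of unsold copies at 2 cents
    if y <= x:
        gain += 4 * (x - y)             # copies returned on the 11th at 4 cents
    else:
        gain -= 6 * (y - x)             # copies bought on the 11th at 6 cents
    return -gain


def expected_loss(x, y):
    return sum(p * loss(x, y, d) for d, p in DEMAND.items())


def best_stock(x, max_stock=10):
    """Value of y with the smallest expected loss (max_stock is an arbitrary cap)."""
    return min(range(max_stock + 1), key=lambda y: expected_loss(x, y))


policy = {x: best_stock(x) for x in range(9)}
print(policy)
```

For x from 0 to 4 the enumeration gives y = 4, and for x from 5 to 8 it gives y = 5, in agreement with the policy stated above.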

Next we shall discuss a more general problem—the case where there 
are several time intervals with carry-over of stock from one time in- 
terval to the next time interval. Here we are given a certain number 
of time intervals, and stock may be ordered or returned at the begin- 
ning of any of the intervals, but at no other times. Unused stock at the 
end of an interval may be kept for use in the next interval, with addi- 
tional stock ordered from the supplier if desired, or some or all of it 
may be returned to the supplier. Only the stock available at the begin- 
ning of an interval may be used to supply demand arising in that inter- 
val, and for the present we assume instantaneous delivery of orders 
from the supplier. The total loss is the sum of losses suffered in each of 
the intervals, and the different intervals may have different loss sched- 
ules. Furthermore, the loss in any time interval may depend on the 
whole “past history”—defined as all the stocks, orders, and demands 
in all the preceding intervals—as well as on the stock before ordering, 
order, and demand of the interval itself. Also, the probability distribu-
tion of the demand that will be observed in an interval may depend on
the past history as well as on the stock before ordering and order of the 
interval itself. Once the past history and the stock before ordering and 
order of the interval are known, the probability distribution of the de- 
mand that will be observed in the interval is completely known. Pre- 
sumably we are interested in minimizing the expectation of the present 
value of the total loss, and therefore will apply the proper discounting 
factors to the losses incurred in the various intervals in order to get the 
present values of those losses. These discounting factors can be as- 
sumed to be incorporated into the loss functions for the various inter- 
vals. 

A complete ordering policy in the case of many intervals must specify 
just how large we should make the order at the beginning of each in- 
terval in the light of the knowledge we possess at the beginning of the 
interval (i.e. our knowledge of the stocks, orders, and demands in the 
preceding intervals and the stock before ordering of the interval itself). 
In general the order we place at the beginning of any interval will de- 
pend upon the past history, and different past histories will require dif- 
ferent orders. In constructing an ordering policy, it is important to 
remember that once we have reached the beginning of an interval, the 
losses we have suffered in the preceding intervals are now beyond our 
control, so it is only the expected losses from the remaining intervals 
that we worry about. 

For the case of many intervals, we now give a method of constructing 
an ordering policy that makes the expected loss as small as possible. 
First we specify how much to order at the beginning of the last inter- 
val. This is simple, for there is only one interval left to worry about, 
and we want to make the expected loss in that one interval as small as 
possible. At the beginning of the last interval we know all that has hap- 
pened in the preceding intervals, and therefore we know the schedule 
of losses and the probability distribution of demand in the last inter- 
val. Thus we have essentially the problem of making the expected loss 
in one interval as small as possible, and we have discussed this prob- 
lem of one interval above. In other words, the problem of how much to 
order at the beginning of the last interval is a simple one-interval prob- 
lem, which we know how to solve. Thus, for all conceivable past his- 
tories we can make up a schedule showing how much to order at the 
beginning of the last interval. 

Now we specify how much to order at the beginning of the next-to- 
the-last interval. Once we know the past history before this next-to- 
the-last interval, then for any particular order we place at the begin-
ning of the next-to-the-last interval, we can compute the total expected
loss in the two last intervals. The reason we can do this is that we have 
already specified what order we are going to place at the beginning of 
the last interval, under any conceivable circumstances. The total ex- 
pected loss in the two last intervals will, of course, depend on the order 
placed at the beginning of the next-to-the-last interval, so we simply 
pick that order that makes the total expected loss in the two last inter- 
vals as small as possible. This order will in general depend on the past 
history before the next-to-last interval. So far, then, we have specified 
how much to order at the beginning of the last interval and at the be- 
ginning of the next-to-last interval. 

Next we specify how much to order at the beginning of the third 
interval from the end. Once we know the past history before that inter- 
val, then for any particular order we place at the beginning of the in- 
terval, we can compute the total expected loss in the last three intervals. 
The reason is that we have already specified how much we will order 
at the beginning of the last two intervals under any conceivable cir- 
cumstances. We place that order at the beginning of the third interval 
from the end that makes the total expected loss in the three last inter- 
vals as small as possible. Thus, we have specified how much to order 
at the beginning of the last three intervals. 

And so we work our way back, interval by interval, until we have 
specified how much to order at the beginning of the first interval. Once 
we reach this point, our problem is solved, for we know how much to 
order at the beginning of each interval to make the total expected loss 
as small as possible. 
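
The backward procedure just described can be sketched in a few lines of Python. This is a simplified illustration and not the general method of the text: we assume that the loss and the distribution of demand in each interval depend on the past history only through the stock before ordering, that unfilled demand is lost, that deliveries are instantaneous, and that stock is a whole number between 0 and max_stock. The names solve, demand_dists and loss, and the toy figures in the usage lines, are hypothetical.

```python
from fractions import Fraction


def solve(num_intervals, max_stock, demand_dists, loss):
    """Work back from the last interval to the first.

    demand_dists[t] maps each possible demand in interval t to its probability.
    loss(t, x, y, d) is the loss in interval t for stock x before ordering,
    stock y after ordering, and demand d.
    Returns (value, policy): value[t][x] is the smallest expected loss from
    interval t onward, and policy[t][x] is the stock after ordering that attains it.
    """
    value = [[0] * (max_stock + 1) for _ in range(num_intervals + 1)]
    policy = [[0] * (max_stock + 1) for _ in range(num_intervals)]
    for t in reversed(range(num_intervals)):
        for x in range(max_stock + 1):
            best_total, best_y = None, None
            for y in range(max_stock + 1):
                # Loss in this interval plus the best expected loss afterwards.
                total = sum(
                    p * (loss(t, x, y, d) + value[t + 1][max(y - d, 0)])
                    for d, p in demand_dists[t].items()
                )
                if best_total is None or total < best_total:
                    best_total, best_y = total, y
            value[t][x], policy[t][x] = best_total, best_y
    return value, policy


# Toy figures (hypothetical): demand is 0 or 2 with equal probability in each
# of two intervals; each unit ordered costs 1, each unit short costs 5,
# and returned units earn nothing.
half = Fraction(1, 2)
toy_demand = [{0: half, 2: half}, {0: half, 2: half}]


def toy_loss(t, x, y, d):
    return max(y - x, 0) + 5 * max(d - y, 0)


value, policy = solve(2, 2, toy_demand, toy_loss)
print(value[0][0], policy[0][0], policy[1])
```

With these toy figures the best course is to stock up to 2 at the start of each interval, and the smallest expected total loss starting from no stock is 3. When the loss or the demand depends on more of the past history, the stock x in the sketch would have to be replaced by the whole history, as the text requires.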

An example with two time intervals will now be given to help clarify 
the discussion just completed. Let us go back to the last example with 
the newsstand proprietor and add another time interval to the problem 
by assuming that the proprietor has two possible ordering times, on the 
mornings of the 1st and 11th of the month. On the 1st he will have no
stock on hand before ordering, but on the 11th he may have some left
over from the quantity he purchased on the 1st and failed to sell. Let
us also assume that all the conditions given in the previous example re-
main unchanged and that the demand during the 1st interval is either
10, 14, or 18 with probabilities of ¼, ½, ¼ respectively, and that the de-
mand during the 2nd interval is independent of the demand during the
1st interval. What we need to determine is how many copies the pro-
prietor should order on the 1st and what ordering policy he should
use on the 11th. No doubt there are vendors, particularly those who are
reluctant to take risks, who would buy only 14 on the 1st in order to
avoid the risk of suffering losses from returning unsold magazines to
the wholesaler. Let us now determine what the best policy really is.
It will be convenient to introduce the following notation:
y_1 = number of magazines purchased on the 1st.
y_2 = number of magazines in stock after purchasing on the 11th.
d_1 = number of people attempting to buy a magazine during the 1st interval.
d_2 = number of people attempting to buy a magazine during the 2nd interval.
x_2 = number of magazines in stock at the end of the 1st interval. (This is equal
to y_1 - d_1 if y_1 > d_1; otherwise it is 0.)


To find the best ordering policy, we make believe that we have
reached the morning of the 11th and therefore know y_1 and d_1. We now
need to find the value of y_2 which makes the expected loss in the 2nd
interval as small as possible with the stock on hand being x_2. But the
best policy on the 11th has already been worked out in the last example
of the one interval case, so that policy is the best one to use on the 11th.
For this 2nd interval we found that y_2 should be 4 if x_2 ≤ 4, in which
case the expected value of the loss is -36 - 6x_2, and y_2 should be 5 if
x_2 ≥ 5, in which case the expected value of the loss is -45¼ - 4x_2. All
that remains to be found is how many the vendor should buy on the 1st
so that the expected value of the losses from both intervals is minimized.
Clearly he should buy at least 14 because he is certain to sell at least 10
during the 1st interval and at least 4 during the 2nd interval. Also, he
should buy at most 18 since the demand during the 1st interval cannot
exceed that quantity.

If d_1 ≥ y_1, then the vendor would sell all y_1 copies at a profit of 9
cents each. His schedule of losses during the 1st interval would then
be

\[
W(y_1, d_1) = -9y_1 \qquad \text{for } d_1 \ge y_1.
\]

He would then end the 1st interval with no stock on hand, x_2 = 0, and
the optimal policy on the 11th would be to buy 4 copies, giving an ex-
pected loss of -36 - 6x_2 = -36 for the 2nd interval. Thus the total loss
from both intervals is

\[
-9y_1 - 36 \qquad \text{for } d_1 \ge y_1.
\]


If d_1 ≤ y_1, the vendor would sell only d_1 copies at 15 cents each and he
would have bought y_1 copies at 6 cents each, making the loss during the
1st interval

\[
W(y_1, d_1) = -[15d_1 - 6y_1] = -15d_1 + 6y_1 \qquad \text{for } d_1 \le y_1.
\]

He would then end the 1st interval with a stock of x_2 = y_1 - d_1, and we
know that the best policy then would be to make y_2 = 4 if x_2 ≤ 4 and to
make y_2 = 5 if x_2 ≥ 5. The expected loss from the 2nd interval in these
two cases is given by

\[
-36 - 6x_2 = -36 - 6(y_1 - d_1) \qquad \text{for } y_1 - d_1 \le 4
\]

and

\[
-45\tfrac{1}{4} - 4x_2 = -45\tfrac{1}{4} - 4(y_1 - d_1) \qquad \text{for } y_1 - d_1 \ge 5.
\]

The total loss for both intervals for y_1 ≥ d_1 would then be given by

\[
-15d_1 + 6y_1 - 36 - 6(y_1 - d_1) = -36 - 9d_1 \qquad \text{for } d_1 \le y_1 \le d_1 + 4
\]

and

\[
-15d_1 + 6y_1 - 45\tfrac{1}{4} - 4(y_1 - d_1) = -45\tfrac{1}{4} - 11d_1 + 2y_1 \qquad \text{for } y_1 \ge d_1 + 5.
\]


We now want to compute the expected value of the total loss for y_1
ranging from 14 to 18, which are the only values we need consider. We
note that for d_1 = 10, we have y_1 ≥ d_1 + 5 for all the values of y_1 except
when y_1 = 14, in which case we have d_1 ≤ y_1 ≤ d_1 + 4. For d_1 = 14, we al-
ways have d_1 ≤ y_1 ≤ d_1 + 4, and for d_1 = 18, we have y_1 ≤ d_1. Hence the
expressions for the expected value of the total loss are

\[
\tfrac{1}{4}(-36 - 9(10)) + \tfrac{1}{2}(-36 - 9(14)) + \tfrac{1}{4}(-126 - 36) = -153 \qquad \text{for } y_1 = 14
\]

and

\[
\tfrac{1}{4}(-45\tfrac{1}{4} - 11(10) + 2y_1) + \tfrac{1}{2}(-36 - 9(14)) + \tfrac{1}{4}(-9y_1 - 36)
= -128\tfrac{13}{16} - \tfrac{7}{4}y_1 \qquad \text{for } 15 \le y_1 \le 18.
\]

By taking y_1 = 18 we find that the expected total loss is -128 13/16 - (7/4)(18)
= -160 5/16, which is the smallest we can make this expected loss. There-
fore the best policy for the vendor is to buy 18 copies on the 1st and to
use the policy previously given on the 11th.
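
The arithmetic of this two-interval example can be verified with a short enumeration. The sketch below is our own check; the function names are assumptions. It takes the demand in the 1st interval to be 10, 14, or 18 with probabilities ¼, ½, ¼, and uses for the 2nd interval the expected losses under the best one-interval policy, namely -36 - 6x_2 when x_2 is at most 4 and -45¼ - 4x_2 when x_2 is at least 5.

```python
from fractions import Fraction

# Demand in the 1st interval and its probabilities.
FIRST_DEMAND = {10: Fraction(1, 4), 14: Fraction(1, 2), 18: Fraction(1, 4)}


def second_interval_loss(x2):
    """Expected loss in the 2nd interval under the best policy on the 11th."""
    if x2 <= 4:
        return -36 - 6 * x2               # order up to 4
    return Fraction(-181, 4) - 4 * x2     # return down to 5; -181/4 is -45 1/4


def total_expected_loss(y1):
    """Expected loss over both intervals when y1 copies are bought on the 1st."""
    total = Fraction(0)
    for d1, p in FIRST_DEMAND.items():
        sold = min(y1, d1)
        first = 6 * y1 - 15 * sold        # bought at 6 cents, sold at 15 cents
        total += p * (first + second_interval_loss(y1 - sold))
    return total


losses = {y1: total_expected_loss(y1) for y1 in range(14, 19)}
best_y1 = min(losses, key=losses.get)

for y1, value in losses.items():
    print(y1, float(value))
print("best y1:", best_y1)
```

The enumeration gives -153 for y_1 = 14 and -160 5/16 (that is, -2565/16) for y_1 = 18, the smallest of the five values.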

A further generalization of the inventory problem is to allow time 
lags in the delivery of orders. In other words, an order placed at the 
beginning of an interval will not arrive until a certain number, T, of
intervals have passed. Otherwise the problem is the same as the type
just discussed, and the method of solution is almost the same. Obvi-
ously the last order will be placed (T+1) time intervals before the end,
since no order placed later than that will arrive in time to be of any use.
We choose the size of the last order to make the total expected loss in
the remaining (T+1) intervals as small as possible. The order we choose
will depend upon the past history known at the moment the order is
placed, and this past history will include quantities already ordered
but which will arrive in the future. Once we have found the proper
size for the last order, for any particular next-to-last order we can com-
pute the expected loss in the remaining (T+2) intervals. We choose
the size of the next-to-last order which minimizes the expected loss in
the remaining (T+2) intervals. This size will, of course, depend upon
the past history known at the moment the order is placed. And so, 
interval by interval, we work our way back to the first order. 

Another generalization is to allow simultaneous demands for several 
different types of items, necessitating the stocking of more than one 
commodity. The demands may be interrelated in any way, and some 
commodities may be partial or complete substitutes for others. It is 
assumed that, given a particular set of demands and a particular set of 
commodities on hand, we know how to use the commodities most ef- 
fectively in trying to satisfy the demands. The schedule of losses in 
this case tells us what our loss is for any given set of demands and any 
given combination of commodities available, assuming that the com- 
modities available are allocated most effectively. In this case the prob- 
ability distribution of demand is a joint probability distribution of the 
different types of demand, which gives us the probability of observing 
any particular combination of demands, and an ordering policy must 
tell how much of each commodity to order at each stage. In computing 
an ordering policy for this case, the principles are the same as in the 
single commodity case, but the details are more troublesome, and for
a large number of items it may be practically impossible to carry out 
the computations. 

As a last generalization we have the case where the probability dis- 
tribution of demand is not completely known—we may know only that 
the distribution is of a certain type. Then, for any given ordering pol- 
icy, there will not be merely one expected loss, but a whole set of ex- 
pected losses, one for each possible distribution of demand. How then 
shall we compare two different ordering policies, since one may be bet- 
ter for some distributions of demand and worse for others? One method 
of doing this is to find, for each ordering policy, the maximum expected 
loss over all possible distributions, and then choose that ordering policy 
with the smallest maximum expected loss. This ordering policy is called 
a “minimax” policy because it minimizes the maximum expected loss. 

In the following example illustrating the minimax policy, some mathe-
matical terminology and methods will be used which may be unfamiliar
to some readers and can be omitted by them without too much loss. Let 
us go back to the very first example in this paper in which a newspaper 
vendor buys papers in the morning at 3 cents per copy, sells them to 
customers at 5 cents per copy, and resells unsold copies to the supplier 
at 1 cent per copy; but now we shall assume that the distribution of the 
demand, d, is normal, with standard deviation 50 and mean between 
1000 and 2000, the exact value of the mean being unknown. We want 
to find how many papers the vendor should buy in the morning accord- 
ing to the minimax policy. 
As before, the vendor’s schedule of losses is given by 


\[
W(y, d) =
\begin{cases}
2y - 4d & \text{for } d \le y, \\
-2y & \text{for } d \ge y.
\end{cases}
\]


From this it is seen that for any order quantity, y, the vendor’s loss 
will be greatest when d is smallest. But as the mean of the normal dis- 
tribution of d decreases, small values of d become more probable and 
large values of d become less probable. Therefore, for any given y, the 
expected loss is greatest when the mean of the normal distribution of d 
is as small as possible, namely, 1000. Hence, if we choose the y that 
minimizes the expected loss when the mean of the distribution of d is 
1000, this will be the minimax y (the y with the smallest maximum ex- 
pected loss). For suppose we use some other y, then the maximum ex- 
pected loss using this other y will occur when the mean of the distribu- 
tion of d is 1000, and this maximum will be greater than if we had used 
the y that minimizes the expected loss when the mean of the distribu- 
tion of d is 1000. To find the minimax y, let F(x) denote the normal 
cumulative probability distribution function with mean 1000 and 
standard deviation 50, and f(x) the corresponding density function. 
Then the expected loss (assuming the mean of d is 1000) is equal to

\[
\left[ 2y - 4\,\frac{\int_{-\infty}^{y} t f(t)\,dt}{F(y)} \right] F(y) - 2y[1 - F(y)]
= -2y - 4\int_{-\infty}^{y} t f(t)\,dt + 4yF(y).
\]


Differentiating with respect to y, we get [-2 + 4F(y)]. This derivative
is zero for F(y)=1/2, negative for F(y)<1/2, positive for F(y)>1/2. 
Therefore we should take y so that F(y) is equal to 1/2, which means y 
should be equal to 1000. This is the minimax y. 
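
The minimax result can be checked numerically. The sketch below is our own illustration under stated assumptions: it uses the identity, valid for a normal distribution with mean m and standard deviation s, that the integral of t f(t) from minus infinity to y equals m F(y) - s² f(y), so that the expected loss is -2y + 4(y - m)F(y) + 4s² f(y); it replaces the continuum of possible means between 1000 and 2000 by a grid of eleven values; and it searches only whole-number order quantities from 800 to 1200. The function names are hypothetical.

```python
import math


def normal_cdf(x, mean, sd):
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def expected_loss(y, mean, sd=50.0):
    """Expected loss in cents for order quantity y under normal demand."""
    F = normal_cdf(y, mean, sd)
    f = normal_pdf(y, mean, sd)
    return -2.0 * y + 4.0 * (y - mean) * F + 4.0 * sd * sd * f


def worst_case_loss(y, means):
    """Largest expected loss over the candidate means."""
    return max(expected_loss(y, m) for m in means)


means = range(1000, 2001, 100)     # assumed grid of possible means
candidates = range(800, 1201)      # assumed range of order quantities
minimax_y = min(candidates, key=lambda y: worst_case_loss(y, means))

print("minimax y:", minimax_y)
print("worst-case expected loss:", worst_case_loss(minimax_y, means))
```

On this grid the search returns y = 1000, and the worst case for every candidate y occurs at the smallest mean, 1000, as the argument in the text leads us to expect.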







CENSUS TRACTS AND URBAN RESEARCH* 


Donald L. Foley
University of California (Berkeley) 


Selected statistics have been reported on a census tract basis by the
Census Bureau for the past four decennial censuses. The number of
tracted cities has increased during this period from 10 to 72. In short, 
the census tract statistical reporting system has become a well developed 
source of information. 

At recent Census Tract Conferences most of the discussion has cen- 
tered on applied uses of tract data. Thus representatives from business, 
market research [1], city planning and various social and health agen- 
cies have reported on putting census tracts to work. This paper supple- 
ments these earlier reports (1) by examining how census tract statistics 
have facilitated urban research of a more theoretical sort, (2) by dis- 
cussing some methodological problems that have been encountered, 
and (3) by suggesting ways in which census tracts can most effectively 
implement such research in the future. The focus here will tend toward 
pure rather than applied, and toward university rather than business or 
civic agency research. 


THE USE OF TRACT STATISTICS IN “PURE” URBAN RESEARCH 
In general, the tract reports issued by the Census Bureau have cen-
tered around certain population and housing characteristics, areally
assigned according to home address. Each category of information has
usually been reported in frequency distribution form, from which se-
lected summary statistical measures (e.g., percentages or averages) can
be computed. 

Research use of tracts is by no means limited to data reported by the 
Census Bureau. This is one of the intriguing assets of the tract report- 
ing system. Numerous and important additional types of information 
have been assembled by local agencies and researchers, although prob- 
ably more for applied than for pure research purposes [2, 3]. Thus, we 
have had tract statistics for juvenile delinquency [4, 5], receipt of wel- 
fare care [5, 6], births and deaths [5, 7], illness [5], mental illness [8], 
suicide [9], residential mobility [6, 10], etc. 

So much for an introductory look. Let us now turn to university re-
search. In which academic fields have tract data been used in the con-
duct of pure research? In general, the research most directly promoted
has been that dealing with the differential characteristics of urban resi-
dential subareas within large cities, being conveniently subsumed under
the label, human ecology [11]. Urban sociologists, urban geographers,
and land and real estate economists have been the most active devotees,
while various other social scientists and scattered professionals with
pure research interests, in such fields as municipal administration and
business, have been peripherally linked.

* Presented before the Census Tract Conference, American Statistical Association, meeting in
Chicago, December 28, 1952. At the time of presentation the author was affiliated with the University
of Rochester.

Discouraging as it may seem to proponents of the census tract sys- 
tem, a sober appraisal leaves the impression that there has been but 
limited pure research use of tract statistics and this mainly in the field 
of urban sociology. Urban geographers have relied on their own map- 
ping and descriptive skills and have generally shunned the compara- 
tive, quantitative methodology that would most logically provide a 
receptive context for using census tract data. Some social scientists, 
notably certain real estate economists, have placed greater reliance on 
census tabulations by city blocks than by census tracts [12]. In political 
science and in other branches of economics there seems to have been 
virtually no research use of the census tract system. 

What research patterns have been employed in adapting tract ma- 
terial to pure research use? An initial distinction here is between those 
studies where census tract statistics have provided the central data and 
those researches where tract figures have been used (although less spec- 
tacularly) in the selection of study districts [13] or in furnishing statis- 
tics of relatively minor importance. 

It would seem fruitful to identify six main ways in which tract sta- 
tistics have been used, viewed in methodological terms. These different 
patterns are not mutually exclusive; two or more may be interwoven 
within the same study. 

1. Descriptive use in which the differential incidence, by tract, of a 
single factor is reported. In this pattern the incidence variations can 
usually be conveniently summarized in map form [2], using what we 
may term an ecological map. Some comprehensive reports have in- 
cluded a series of such maps, reporting both census collected and lo- 
cally assembled statistics. Among the most ambitious of such reports 
are those for Minneapolis [14], Seattle [15], Cleveland [7], and Rochester 
[5, 16]. In some studies of very large cities, tracts have been combined 
to form concentric zones or sectors, with incidence rates reported ac- 
cordingly [4, 8, 17]. 

2. Descriptive use in which the cross-cutting of two or more separate 
incidence patterns is reported. In map form, this use involves either the 
comparison of two or more of the single factor maps, as prepared in use
(1), or the preparation of a single map in which the cross classification
of factors is shown with the aid of an appropriate legend [2, 18]. This 
step characteristically precedes the somewhat more sophisticated uses 
(4), (5), or (6). 

3. Time-series use in which changes, by tracts, are reported for stated 
periods of years. It is common here to summarize the findings in map 
form with legends indicating percentage increases or with time-series 
graphs fitted into the various tracts. Where statistics have been re- 
ported by concentric zones for some of the largest cities, various other 
forms of graphic analysis have been used, as in the Chicago studies of 
population succession by Cressey and Ford [19, 20]. 

4. Analysis of relationships, utilizing what has been termed ecological 
correlation [21]. Here the variables are summary measures, by census 
tracts. Thus, one can correlate per cent foreign born and median school 
years completed. In this case nativity status and education are not 
correlated directly, person by person, as in individual correlation. Usu- 
ally, in fact, in ecological correlation we do not know this information 
on a person to person basis. Studies of this type have been conducted
in Chicago [4, 8, 22], St. Louis [6, 23], and other cities [5, 24]. 

5. The interpretation of individuals’ characteristics in terms of the 
general social environment of the tract. In this case the former emerge 
from the specific study while the latter is available in the form of pre- 
viously published tract statistics. Faris and Dunham [8], for example, 
utilized this design to demonstrate that mental illness rates were higher 
for Negroes and certain other groupings in areas (combinations of tracts) 
not primarily populated by their own members. 

6. Use in statistical index form, each index presumed to represent a 
cluster of factors. Thus, average rental [7] and median education [25] 
have been promoted as indices of socio-economic status. A challenging 
recent attempt to develop statistical indices is the work by Shevky and 
Williams using Los Angeles census tract data [18] in developing three 
indices: for social rank (roughly socio-economic status), for urbaniza- 
tion (a complex of factors relating to type of family life), and for segre- 
gation (the residential concentration of minority groups). Based on the 
alternate ways in which these three indices can be related to each 
other, the authors have suggested a typology of residential areas. Alter- 
nate segregation indices have also been suggested by other researchers 
[26, 27, 28]. Kendall and Lazarsfeld have presented a stimulating dis- 
cussion of the various types of indices usable at a tract level according 
to the alternative logical ways by which they relate to direct charac- 
terizations of the individuals included [29, pp. 187-196]. 







SOME METHODOLOGICAL PROBLEMS 


The research use of census tract data has involved a series of ecologi- 
cal and statistical assumptions, some of which have been reexamined 
during recent years. It should be recognized that the most vigorous ex- 
pansion of the census tract system tended to coincide with the phe- 
nomenal rise of the ecological “school” of social research. By now, the 
sociologist’s intellectual honeymoon with urban ecology is over and he 
is faced with the problem of settling down and living with this ecologi- 
cal approach. Such scholars of ecology as Hollingshead and Hawley [30] 
have in recent years identified theoretical difficulties inherent in “clas- 
sical” ecology and have indicated considerable skepticism regarding 
the future utility of spatial analysis, narrowly conceived. 

Let us examine some of the more specific assumptions that have been 
implicit in ecological research using census tract data: 

1. It was assumed during urban ecology’s early years that the large 
city was divided into “natural areas.” There was some belief that cen- 
sus tracts could be so established that they would coincide closely with 
these natural areas. Thus, internal homogeneity was sought and as- 
sumed for each tract [2, 31, 32]. The utility of the natural area concept 
has since been questioned by Hatt [33, 34] and the usefulness of data
from non-homogeneous tracts has been challenged by a number of re- 
searchers [15, Appendix B; 35]. Myers, for example, in a recent study 
concluded that of New Haven’s 28 census tracts “10 tracts are homo- 
geneous to a remarkable extent; seven are less homogeneous; while the 
remaining 11 are heterogeneous” [36]. 

2. With Burgess’ important concentric zone construct, it appeared 
likely that general principles of urban spatial patterning would emerge, 
embodied in an ecological theory of urban structure. It has become in- 
creasingly evident, however, that at best alternative constructs must 
be admitted, such as the Hoyt sector theory and Davie’s insistence on 
the industrial pattern’s primacy. A more pessimistic view concludes 
that for many cities historical or topographic factors have had so per- 
vasive an influence as seriously to limit the predictive value of the 
broader principles. So while some cities (Chicago, St. Louis, Rochester 
[37]) tend to uphold much of Burgess’ and/or Hoyt’s theories, other 
cities (Boston [38], Pittsburgh, New York, Flint [39]) have more com- 
plex patterns. We may eventually need to introduce a typological sys- 
tem such as Shevky’s that will be less geared to grand principles and 
more to identifying certain types of urban areas in whatever overall 
pattern they take. Where Burgess’ scheme has not proved applicable,
the use of concentric mile zones, gradients, and similar ecological techniques
tends to lose much of its utility.

3. In many research projects it has been assumed that ecological in- 
dices were valid measures of certain social phenomena. In the study of 
juvenile delinquency, for example, the number of boys brought before 
a juvenile court or other agency (and expressed as a rate) has been used 
as an index of delinquency [4]. In the 1940 Census enumeration, the 
number or per cent of dwelling units “needing major repair” was available
as an index of housing condition.¹ In 1950 a “dilapidated” category
was introduced. But we have had relatively little systematic validation
of these indices. The work by Schmid in this connection is important.
Using 1940 tract statistics from 20 medium-sized cities, he
examined the degree to which a single index, such as educational level 
or rental level, is a valid measure of a larger complex of factors [25]. 

4. There has been some tendency for researchers rather uncritically 
to accept census tract statistics as reliable. There are conditions, how- 
ever, under which one should recognize that a sampling error may be 
present, particularly where the population base for the tract is small. 
This problem was recognized rather early in the development of the 
census tract system by various statisticians [40, 41, 42, 43], but it is not 
certain that all other users of tract statistics have heeded the cautions. 
Now in the 1950 census tract reports the problem has been reopened by 
the Census’ reliance on a 20 per cent sample for some nine published 
tract tabulations. This has resulted in such potentially important statistical
indices as years of schooling and family income now being subject
to sampling error.²

5. In the impressive series of studies that have used ecological cor- 
relation it has been assumed that correlations demonstrated meaning- 
ful interrelations of factors. Certain scholars in the 1930s [44, 45] and a 
recent vigorous article by Robinson [21] have pointed to serious statis- 
tical difficulties implicit in ecological correlation. Robinson concludes 
that “... the only reasonable assumption is that an ecological correlation
is almost certainly not equal to its corresponding individual correlation”
[21, p. 357]. These critics have thus shown not only that ecological
correlations run higher than individual correlations, but that
the fewer the ecological areas, the higher the correlations. Hence a correlation
for City X based on 20 large districts will turn out to be larger
than one based on 85 smaller districts, say, census tracts. Menzel, on
the other hand, has pleaded the case for ecological correlation where it
is clearly understood that the characteristics being correlated are
meaningfully interpretable in areal as well as (or instead of) in individual
terms [46].

1 As a matter of fact, this index did not prove to be consistently valid when used in research in
St. Louis. This was apparently related to the subjectivity involved in its enumeration.

2 This author is indebted to Professor Calvin Schmid for his emphasis on this problem. After
Schmid’s methodological research tended to validate the use of median school years completed as an
important index [25], we now find that in the 1950 census reports the utility of this statistical indicator
is somewhat reduced.

6. In most earlier ecological research a rather static approach was 
assumed. Hence, to study urban structure it was necessary to have 
statistics that related only to the individual in his tract of residence at 
the time of the enumeration. No information, for example, was pro- 
vided on home-work or home-shopping spatial relations. With rare 
exceptions (Cleveland statistics for several years during the 1930s 
[10]), we have lacked information on intertract residential mobility 
within a metropolitan region. We have had no usable information as to 
residents’ association memberships and psychological identifications. 
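The sampling-error caution in point 4 can be given a rough order of magnitude. The following sketch is illustrative only: it assumes a simple random sample of persons drawn without replacement at the 20 per cent rate mentioned above, and the tract populations and the 30 per cent proportion are hypothetical figures, not Census values.

```python
import math


def se_proportion(p, tract_population, sampling_fraction=0.20):
    """Approximate standard error of a proportion estimated for one tract.

    Assumption: simple random sampling without replacement within the tract.
    """
    n = tract_population * sampling_fraction      # expected sample size
    fpc = 1.0 - sampling_fraction                 # finite population correction
    return math.sqrt(fpc * p * (1.0 - p) / n)


if __name__ == "__main__":
    # Hypothetical tract populations, small to large.
    for population in (500, 2000, 8000):
        se = se_proportion(0.30, population)
        print(f"tract of {population:5d} persons: standard error about {se:.3f}")
```

Under these assumptions the standard error is halved only when the tract population is quadrupled, which is why the smallest tracts call for the most caution.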


PROMISING FUTURE USES OF CENSUS TRACT DATA 


It now seems appropriate to summarize what appear to be some of 
the most fruitful continuing uses for census tract statistics in pure ur- 
ban research. For in spite of the skeptical tone in which a number of 
the above points have been phrased, it is apparent that the tract reporting
system, when judiciously utilized, fills a striking need and certainly
deserves to be maintained. It is far more economical for Census
reporting and more convenient for a variety of research applications 
than is reporting on a block basis. For the largest cities it provides a 
workable unit by which statistics can also be assembled for even larger 
areas or districts. 

An initial recommendation is that researchers in such academic fields 
as geography, political science, and social psychology be “educated” to 
the potential research adaptations of the census tract system. For ex- 
ample, the author recently overheard a political scientist admitting 
ignorance of census tract data, when questioned by a fellow sociologist. 
Nor had this political scientist heard of the recent study by Salmon and 
Olds of St. Louis voting behavior [23]. This scholar in the field of po- 
litical behavior showed considerable interest in the fact that tract sta- 
tistics could often be combined into ward statistics making possible 
ecological correlations between voting behavior and various social 
characteristics. 

A second suggestion is that census tract data may have their greatest
general research value in providing rough ecological profiles. It
would be misleading to promise too much in the name of census tract
statistics. They do not, for example, offer the areal refinement available 
from block data. Then, too, tract statistics are typically encumbered 
with certain limitations inherent in their functioning as statistical in- 
dices. There would seem to be a continuing need for guidance to poten- 
tial users. 

A third proposal flows from the second: that the use of tract statistics 
be integrated with other research approaches. An analysis of tract in- 
formation or the plotting of tract statistics on work maps can some- 
times be helpful at exploratory levels of research. Statistical profiles of 
particular sections of a city provide an excellent backdrop for non- 
quantitative case-study types of analyses. In Stephan’s words (dating 
from the mid 1930s), “Census tract research will probably be most ef- 
fective when considered not as a method of study complete in itself 
but as one step in a sequence of investigations” [40, p. 166 Suppl.].

Fourth, ecological correlations should be used only if it is clearly un- 
derstood that they tend to relate characteristics of areal units and that 
they are not adequate substitutes for individual correlation. If, for ex- 
ample, a researcher wants to study the correlates of the incidence of 
mental illness, he should recognize the methodological alternative of 
directly exploring the background characteristics of persons who are ill. 
The researcher should also take into account the effect of tract or areal 
unit size on the magnitude of the resulting ecological correlation. 
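The dependence of an ecological correlation on the size of the areal unit can be illustrated by a small simulation. The sketch below is not drawn from any of the studies cited. It assumes a hypothetical city in which two characteristics share a weak locational factor and are otherwise dominated by individual variation. The counts of 80 and 20 areas, the noise level, and the seed are arbitrary choices.

```python
import math
import random


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def area_means(values, n_areas):
    """Average consecutive (spatially contiguous) persons into equal-sized areas."""
    size = len(values) // n_areas
    return [sum(values[i * size:(i + 1) * size]) / size for i in range(n_areas)]


def simulate(n_people=16000, noise_sd=2.0, seed=1953):
    """Hypothetical city: a shared locational factor plus large individual noise."""
    rng = random.Random(seed)
    location = sorted(rng.random() for _ in range(n_people))  # position on a gradient
    x = [u + rng.gauss(0.0, noise_sd) for u in location]
    y = [u + rng.gauss(0.0, noise_sd) for u in location]
    return x, y


if __name__ == "__main__":
    x, y = simulate()
    print("individual correlation:", round(pearson(x, y), 3))
    for n_areas in (80, 20):
        r = pearson(area_means(x, n_areas), area_means(y, n_areas))
        print(f"correlation of means over {n_areas} areas:", round(r, 3))
```

In a construction of this kind the correlation among individuals is close to zero, the correlation of area means is far higher, and the coarser grouping should tend to give the larger coefficient, since averaging over more persons removes more of the individual variation.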

Fifth, there is a continuing need for ingenuity in introducing new 
types and forms of tract information. At the University of Miami, 
Wolff has been developing a technique for forecasting population by 
census tracts [47]. With many of the largest cities now having a back- 
ground of three or four decades of census statistics, such analysis of in- 
ternal population trends may become increasingly feasible. 

Under the sponsorship of the Social Science Research Council, the 
Pacific Coast Committee on Community Studies (Leonard Broom, 
Chairman) is currently preparing a research memorandum that will 
include several methodological contributions [48]. Schmid has been re- 
fining an approach whereby the Guttman scaling technique may be ap- 
plied to census tract data in an attempt to type residential areas. 
Robinson, Broom, Shevky, and Bell have all been engaged in further 
developing and testing areal typologies. These researches will be in- 
cluded in the Committee’s memorandum. One other recent West Coast 
attempt at developing urban subcultural areas is Wann’s research at 
the University of California [49]. 











There would seem to be a continuing need for more measures, on a 
tract basis, of residential mobility and various commuting and activity
patterns. The brilliant theoretical study of residential mobility by 
Stouffer [50] was only possible because of Green’s unique assembly of 
intertract residence shifts [10]. If it were possible to replicate these 
statistics in other cities or to devise similar cross tabulations, by tracts, 
on home-to-work or on home-to-other-activity movements, our under- 
standing of daily population movements and of dependency on com- 
munity facilities could be enhanced. The coding of certain information 
by tract of employment or of shopping might be a helpful variant. 

And, finally, it seems appropriate to stress the need within each large 
city for effective communication among researchers so as to maximize 
the chances that data and methods from one study will have by-prod- 
uct value for succeeding studies. Base maps, street indexes, and certain 
arrangements for filing and interchanging data should be provided. 
The highly ingenious punched card system developed for St. Louis by 
Olds [51], although built around the city block as the basic unit, may 
have certain applicability on a tract or a block-and-tract basis for other 
cities. 

REFERENCES 

[1] Cunningham, Ross M., “Evaluation of census tracts,” Journal of Marketing,
15 (1951), 463-70.

[2] U. S. Bureau of the Census, Census Tract Manual, Washington, January 
1947 (Third edition, revised and enlarged). 

[3] Reed, Vergil D., “Business uses of data by census tracts and blocks,” 
Journal of the American Statistical Association, 37 (1942), 238-46. 

[4] Shaw, Clifford, and McKay, Henry D., Juvenile Delinquency and Urban 
Areas. Chicago: University of Chicago Press, 1942. 

[5] Koos, Earl Lomon, Rochester, New York, III: An Atlas of the Ecological 
Patterns of the City’s Social Problems. Rochester: Council of Social Agencies, 
1944.

[6] Fletcher, Ralph C., Hornback, Harry L., and Queen, Stuart A., Social 
Statistics of St. Louis by Census Tracts. St. Louis: School of Business and 
Public Administration, Washington University, January 1935. 

[7] Green, Howard Whipple, Population Characteristics by Census Tracts, 
Cleveland, Ohio, 1930. Cleveland: Cleveland Plain Dealer, 1931.

[8] Faris, Robert E. L., and Dunham, H. Warren, Mental Disorders in Urban 
Areas. Chicago: University of Chicago Press, 1939. 

[9] Porterfield, Austin L., “Suicide and crime in the social structure of an 
urban setting: Fort Worth, 1930-1950,” American Sociological Review, 17 
(1952), 341-49. 

[10] Green, Howard Whipple, Movements of Families within the Cleveland Metro- 
politan District. Cleveland: Real Property Inventory of the Metropolitan 
District, Report No. 7, 1936. 

[11] Quinn, James A., “Topical summary of current literature on human ecol- 
ogy,” American Journal of Sociology, 46 (1940), 191-226. 


















[12] Federal Housing Administration, The Structure and Growth of Residential 
Neighborhoods in American Cities (by Homer Hoyt), Washington: U. S.
Government Printing Office, 1939. 

[13] Cohen, Lillian, “Los Angeles rooming-house kaleidoscope,” American 
Sociological Review, 16 (1951), 316-26. 

[14] Schmid, Calvin F., Social Saga of Two Cities. Minneapolis: Minneapolis 
Council of Social Agencies, 1937. 

[15] Schmid, Calvin F., Social Trends in Seattle. Seattle: University of Washing- 
ton Publications in the Social Sciences, Vol. 14, October 1944. 

[16] Koos, Earl Lomon, Rochester, New York, 1940: An Atlas of Population 
Variables by Census Tracts. Rochester: Council of Social Agencies, 1943. 

[17] Frazier, E. Franklin, “Negro Harlem: an ecological study,” American
Journal of Sociology, 43 (1937), 72-88. 

[18] Shevky, Eshref, and Williams, Marilyn, The Social Areas of Los Angeles. 
Berkeley: University of California Press, 1949. 

[19] Cressey, Paul F., “Population succession in Chicago: 1898-1930,” American 
Journal of Sociology, 44 (1938), 59-69. 

[20] Ford, Richard G., “Population succession in Chicago,” American Journal
of Sociology, 56 (1950), 156-60. 

[21] Robinson, W. S., “Ecological correlations and the behavior of individuals,” 
American Sociological Review, 15 (1950), 351-57. 

[22] Thompson, Warren S., and Ruth, Nelle J., “Ratios of children to women in 
Chicago and Cleveland census tracts,” American Sociological Review, 4 
(1939), 773-91. 

[23] Salmon, David W., and Olds, Edward B., St. Louis Voting Behavior Study. 
St. Louis: Metropolitan St. Louis Census Committee of the St. Louis 
Chapter, American Statistical Association, 1949. 

[24] Thompson, Warren S., “Some factors influencing the ratios of children to
women in American cities, 1930,” American Journal of Sociology, 45 (1939), 
183-99. 

[25] Schmid, Calvin F., “Generalizations concerning the ecology of the American 
city,” American Sociological Review, 15 (1950), 264-81. 

[26] Jahn, Julius A., Schmid, Calvin F., and Schrag, Clarence, “The measure- 
ment of ecological segregation,” American Sociological Review, 12 (1947), 
293-303. 

[27] Jahn, Julius A., “The measurement of ecological segregation: derivation of 
an index based on the criterion of reproducibility,” American Sociological 
Review, 15 (1950), 100-4.

[28] Cowgill, Donald O., and Cowgill, Mary S., “An index of segregation based 
on block statistics,” American Sociological Review, 16 (1951), 825-31. 

[29] Kendall, Patricia L., and Lazarsfeld, Paul F., “Problems of survey analy- 
sis,” in Robert K. Merton and Paul F. Lazarsfeld (eds.), Continuities in 
Social Research. Glencoe: The Free Press, 1950. Pp. 133-96. 

[30] Hollingshead, A. B., “Community research: development and present 
conditions,” American Sociological Review, 13 (1948), 136-46. Also discus- 
sion following, pp. 146-56, especially by Amos H. Hawley, pp. 153-56. 

[31] Leiffer, Murray H., “A method for determining local urban community 
boundaries,” Publications of the American Sociological Society, 26 (1932),
137-43.







[32] Schmid, Calvin F., “The theory and practice of planning census tracts,” 
Sociology and Social Research, 22 (1938), 228-38. 

[33] Hatt, Paul, “Spatial patterns in a polyethnic area,” American Sociological 
Review, 10 (1945), 352-56. 

[34] Hatt, Paul, “The concept of natural area,” American Sociological Review, 11 
(1946), 423-27. 

[35] Young, Kimball, Gillin, John L., and Dedrick, Calvert L., The Madison 
Community. Madison: University of Wisconsin Press, 1934. 

[36] Myers, Jerome K., “The homogeneity of census tracts: a methodological 
problem in urban ecological research,” American Sociological Review, forth- 
coming. 

[37] Bowers, Raymond W., “Ecological patterning of Rochester, New York,” 
American Sociological Review, 4 (1939), 180-89. 

[38] Rodwin, Lloyd, “The theory of residential growth and structure: evaluation 
of prevailing theories,” Appraisal Journal, 18 (1950), 295-317. Also comments
by Homer Hoyt and Walter Firey, with rejoinder by Rodwin, Ibid.,
18 (1950), 445-57. 

[39] Kantnor, John, The Relationship between Accessibility and Socio-Economic 
Status of Residential Lands, Flint, Michigan. Ann Arbor: University of 
Michigan, March 1948. 

[40] Stephan, Frederick F., “Sampling errors and interpretations of social data 
ordered in time and space,” Journal of the American Statistical Association, 
29 (1934), 165-66. 

[41] Chaddock, Robert E., “The significance of infant mortality rates for small 
geographic areas,” Journal of the American Statistical Association, 29 (1934), 
243-49. 

[42] Ross, Frank A., “Ecology and the statistical method,” American Journal 
of Sociology, 38 (1933), 507-22. 

[43] Queen, Stuart A., “The ecological study of mental disorders,” American 
Sociological Review, 5 (1940), 201-9. 

[44] Gehlke, C. E., and Biehl, Katherine, “Certain effects of grouping upon the 
size of the correlation coefficient in census tract material,” Journal of the 
American Statistical Association, 29 (1934), 169-70. 

[45] Neprash, J. A., “Some problems in the correlation of spatially distributed 
variables,” Journal of the American Statistical Association, 29 (1934), 167-68. 

[46] Menzel, Herbert, “Communication on Robinson’s ‘Ecological correlation
and the behavior of individuals,’ ” American Sociological Review, 15 (1950), 
674. 

[47] Wolff, Reinhold P., “The forecasting of population by census tracts in an 
urban area,” Land Economics, forthcoming. 

[48] Letter to author from Professor Leonard Broom dated October 17, 1952. 

[49] Wann, Trenton W., “Objective determination of urban sub-culture areas,” 
Unpublished Ph.D. dissertation, Department of Psychology, University of 
California (Berkeley), 1949. 

[50] Stouffer, Samuel A., “Intervening opportunities: a theory relating mobility 
and distance,” American Sociological Review, 5 (1940), 845-67. 

[51] Olds, Edward B., “The city block as a unit for recording and analyzing 
urban data,” Journal of the American Statistical Association, 44 (1949), 
485-500. 





ON A PROBABILITY MECHANISM TO ATTAIN AN 
ECONOMIC BALANCE BETWEEN THE RE- 
SULTANT ERROR OF RESPONSE AND 
THE BIAS OF NONRESPONSE 


W. Edwards Deming
New York University 


The author postulates a probability mechanism for the 
simultaneous production of the bias of nonresponse and for the 
variance of response. The nonresponse arises from a graded 
series of classes of the members of the universe to be 
sampled. The classes range from an impregnable core of no 
possible response, on up to a class of complete response. 
Nonresponse arises from two sources, not at home, and re- 
fusal. Refusals are of two kinds, permanent and temporary. 
The variation in the amount of time spent at home, and the 
variation in the firmness of the temporary refusal, produce the 
graded series of classes. The bias of nonresponse arises from 
the variation of any characteristic from one class to another. 
The variance of response arises from the variation of any 
characteristics from one member to another within a single 
class, and from the random variation in the number of re- 
sponses therefrom. 

An increase in the size of the initial sample or a more 
efficient method of selection will decrease the variance of 
response, but will have no effect on the bias of nonresponse. 
Successive recalls, on the other hand, decrease the bias of 
nonresponse, and are more effective than an increase in the size
of the sample or a more efficient method of selection in de- 
creasing the root-mean-square error which arises from both 
nonresponse and from the variation of response. 

The results show that without recalls, it is hazardous to 
put any confidence in the result, no matter how big the sample, 
even when the variation in the measured characteristic is only 
two-fold from the class of lowest response to the class of 
highest response. 

With the levels of response assumed here (taken from aver- 
age urban experience), and with an estimate formed by 
summing up the initial call and the recalls, the first two recalls 
effect together about a 50% reduction in the initial bias of 
nonresponse. Further recalls continue to be productive. In 
fact, with this method of estimation, each recall added to a 
sampling plan, even to six recalls, actually increases the 
amount of information obtained for each dollar expended on 
interviewing. 

Even with three recalls, and with only a two-fold variation
from the class of lowest response to the class of highest response,
an initial sample bigger than the equivalent of from
300 to 500 binomial cases in any one subclass is ineffective and
uneconomical. The apparent precision of a bigger sample is a 
delusion, as with bigger samples the bias of nonresponse will 
eclipse the error of sampling unless there are 4, 5, or more 
recalls. An attempted “complete count” is no exception and 
often represents an extreme waste of effort. 

For high accuracy, a plan that uses the ordinary method of
estimation by combining the initial attempt and the recalls 
must support 4, 5, or 6 recalls, along with an initial sample 
equivalent to from 800 to 1500 binomial cases. 

For any proposed survey, calculations based on rough ad- 
vance estimates of the constants that appear in the formulas 
will predict to a useful degree of approximation the biases and 
the variances to be expected from various types of plans. Fig- 
ures on costs will then point out which plan is most economical, 
of those that are possible, for the attainment of a prescribed 
accuracy. 

Where extremely high accuracy is required, the Politz plan 
with 2000 or more binomial cases becomes competitive in cost 
with a survey that depends on recalls. In any case, the 
Politz plan has the advantage of speed and of being able to 
produce results under circumstances wherein recalls are im- 
possible (for example, listening to a radio program). 

The proposed mechanism provides a theory of bias to sup- 
plement the theory of sampling. It indicates the possibility of 
new and more efficient methods of estimation than the simple 
combination of the initial attempt and the recalls, as it will 
provide a rational basis for extracting more information from 
the recalls. It will also point out, for any particular method 
of estimation, what empirical information will be helpful in the 
planning of the efficient allocation of effort amongst the initial 
sample and the recalls. 


THE AIM OF THIS RESEARCH 


NONRESPONSE in a survey is devastating and discouraging, whether
the survey be by mail or by interview. In careful survey-practice,
efforts have been made in many directions to reduce it. One usual 
solution is to find ways to build up the initial response. An additional 
solution is to call on the nonresponses, and to call and call. The first 
recorded systematic plan for putting pressure on a sample of nonrespondents
appears to have been carried out by Maurice Leven¹ in 1934.
Substitution does not help: it is only equivalent to building up the
size of the initial sample, leaving the bias of nonresponse undiminished.





1 Maurice Leven, The Incomes of Physicians (Chicago, 1932), pp. 12 and 13. Mr. Stanley Lebergott
of Washington called my attention to this work. With regard to the ineffectiveness of substitution, see, 
for example, Cochran, Sampling Techniques (Wiley, 1953), p. 302. 













Hansen and Hurwitz² found the optimum fraction of recalls to reach
the minimum sampling variance for a fixed total cost, the estimate being
formed, as usual, by pooling the initial responses and the recalls. Birnbaum
and Sirken³ apparently sought to minimize the mean square
error that arises from both the variance of the responses and from
failure to obtain interviews for any reason, including the permanent
refusals (this I judge from their “nonresponding groups responding yes,”
not clear to this author). Houseman⁴ presented new results on the
total bias that may arise from different classes of nonresponse. A new
approach, in surveys conducted by interviews, is the Politz plan,⁵
in which only the temporary refusals require recalls, as the correction 
for people not found at home while the interviewer is in the area is 
made by classifying the respondents according to the chance of finding 
them at home, and then by weighting the responses accordingly. 
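The weighting idea behind the Politz plan can be sketched briefly. The example assumes, purely for illustration, that each respondent reports on how many of five previous comparable occasions he was at home, and that the chance of finding him is estimated as that count plus one, divided by six. This rule, the function name, and the data are assumptions of the sketch and are not taken from the description above.

```python
def politz_weighted_mean(responses, n_prior=5):
    """Weighted mean in which each response counts inversely to the
    estimated chance of finding the respondent at home.

    responses: list of (value, times_home) pairs, where times_home is the
    reported number of the previous n_prior occasions spent at home.
    The estimated chance (times_home + 1) / (n_prior + 1) is an assumption.
    """
    weighted_total = 0.0
    weight_sum = 0.0
    for value, times_home in responses:
        p_home = (times_home + 1) / (n_prior + 1)
        weight = 1.0 / p_home          # rarely-home respondents count for more
        weighted_total += weight * value
        weight_sum += weight
    return weighted_total / weight_sum


if __name__ == "__main__":
    # Hypothetical yes/no answers (1.0 or 0.0) with reported times at home.
    sample = [(1.0, 5), (1.0, 5), (1.0, 4), (0.0, 1), (0.0, 0)]
    unweighted = sum(value for value, _ in sample) / len(sample)
    print("unweighted mean:", round(unweighted, 3))
    print("weighted mean:  ", round(politz_weighted_mean(sample), 3))
```

The persons seldom at home are scarce among those interviewed on a single call, and the weights restore them to something like their proper share without recalls.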

It turns out that the bias of nonresponse is probably so serious in 
many if not most surveys that the specification of the number of recalls, 
and the adjustment of the original size of the sample to permit either 
the use of the Politz plan or the requisite number of recalls to balance 
the bias of nonresponse against the variance, and to stay within the 
allowable budget, are an essential part of sample-design where the aim 
is to produce as much information as possible per unit cost. 

The purpose of this paper is (a) to study the evidence produced by 
a proposed mechanism that will give rise to a calculable variance, to 
a calculable bias of nonresponse, and to a calculable cost; (b) on the 
basis of this mechanism to make a determination of the number of re- 
calls that are required to reach a desired accuracy at minimum cost. 
The allocation of the effort between the initial sample and the recalls 
is as important as the usual theory for calculating a sample-size. 

2 Morris H. Hansen and William N. Hurwitz, “The problem of nonresponse in sample surveys,”
Journal of the American Statistical Association, vol. 41 (1946), pp. 517-29.

3 Z. W. Birnbaum and Monroe G. Sirken, “Bias due to nonavailability in sampling surveys,” Journal
of the American Statistical Association, vol. 45 (1950), pp. 98-111. “On the total error due to noninterview
and to random sampling,” International Journal of Opinion and Attitude Research, vol. 4, pp. 179-91.
Cochran in his Sampling Techniques (Wiley, 1953) gives on page 296 an excellent summary of Birnbaum
and Sirken’s results.

4 Earl E. Houseman, “Statistical treatment of the nonresponse problem,” Agricultural Economics
Research, vol. v (1953), pp. 12-19.

5 The Politz plan was under discussion as early as 1945 in conversations between Mr. Politz and
this author. Experimental work thereon commenced in 1946 in the Alfred Politz research organization,
in which the weighting became routine through various simplifying procedures. Some theory and
application were presented in a joint article by Alfred Politz and Willard R. Simmons, “An attempt to
get the not-at-homes into the sample without call-backs,” Journal of the American Statistical Association,
vol. 44 (1949), pp. 9-31.

H. O. Hartley described what is essentially the Politz idea in a discussion of a paper that had been
read by Yates at a meeting in London (see Frank Yates, Journal of the Royal Statistical Society, vol.
cix (1946), p. 37 in particular), but Hartley made no mention of experimental work either accomplished
or intended.






















A further purpose of the paper is to compare the results and the costs 
of recalls with the alternative Politz plan. 


CRITERION FOR THE OPTIMUM PLAN 


We now define the root-mean-square error. The criterion to be 
adopted here for the optimum plan is that it shall deliver a prescribed 
mean square error at minimum cost. The root-mean-square error (to be 
abbreviated r-m-s error hereafter) of any plan of survey will by defini- 
tion denote the hypotenuse of a right triangle, one leg of which is the 
bias of the nonresponse that arises from the plan, and the other leg of 
which is the standard error of the plan (see Fig. 1). Different plans 
will have different triangles. By definition, the criterion for the opti- 
mum plan is that it shall give a shorter hypotenuse than any other plan 
will give for the same cost; or, alternately, a plan is optimum if it, 
among all possible plans, will deliver a prescribed length of hypotenuse 
at the lowest cost. One plan is “better” than another if it will yield a
shorter hypotenuse than the other, for the same cost.

[Figure 1: a right triangle with one leg labeled “The bias of nonresponse” and the other labeled “The standard error of response.”]

FIGURE 1. Any plan of survey will possess a bias of nonresponse and a standard
error of response. The right angle addition of the two forms the root-mean-square
error of the particular plan.

There are a
number of nonsampling errors in all surveys, whether complete or 
sample.⁶ The bias of nonresponse is only one of them. It exists, of course,
in complete counts as well as in samples. In fact, the conclusions to 
be reached at the end will point to some drastic re-orientation of the 
effort expended on complete counts. Both the bias of nonresponse and 
the error of sampling exist in sample surveys. These are the two errors 
that within any particular framework of design of sampling, inter- 
viewing, and questioning, are direct functions of the size of the sample 
and of the number of recalls. 
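The right-angle addition of Fig. 1 is readily put into arithmetic. The sketch below is an illustration only: the bias of 0.05, the proportion of 0.5, and the sample sizes are hypothetical, and the standard error is taken to be that of a simple binomial proportion.

```python
import math


def rms_error(bias, standard_error):
    """Hypotenuse of the right triangle whose legs are the bias of
    nonresponse and the standard error of response."""
    return math.hypot(bias, standard_error)


def binomial_standard_error(p, n):
    """Standard error of a proportion from n binomial cases (an assumption)."""
    return math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    p, bias = 0.5, 0.05  # hypothetical proportion and fixed bias of nonresponse
    for n in (100, 400, 1600, 6400):
        se = binomial_standard_error(p, n)
        print(f"n = {n:5d}  standard error = {se:.4f}  r-m-s error = {rms_error(bias, se):.4f}")
```

However large the sample, the hypotenuse can never be shorter than the leg that represents the bias. Enlarging the sample shortens only the other leg.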





6 A list of such errors with discussion is contained in Chapter 2 of Deming’s Some Theory of Sampling
(John Wiley, 1950); and in an article entitled “On errors in surveys,” American Sociological Review, ix
(1944), 359-69.










As one seldom knows the resultant magnitude of all the non- 
sampling errors, and as they vary from one survey to another, the most 
sensible magnitude to aim at for the r-m-s error of the combination of 
sampling and of nonresponse (the hypotenuse in Fig. 1) will vary like- 
wise. One might aim at a r-m-s error of 7% in one survey, at 10% in 
another, and at 20% in another. Even with unlimited expenditure to 
reduce the r-m-s error to very low proportions, other errors will still 
be present unless funds are diverted to reduce them also. 


QUANTIFICATION OF THE PROBLEM 


The probability mechanism or model will now be described. The 
population to be sampled will be divided into six classes, according to 
the average proportion of interviews that will be completed success- 
fully out of 8 attempts. The classes will be designated by 0, 1, 2, 4, 
6, 8 to denote 0, 1, 2, 4, 6, 8 interviews completed, on the average, out 
of 8 attempts. These figures will often appear as subscripts to various 
other symbols. Six classes will be sufficient; more classes would not alter
the results enough to warrant the extra labor. 

We assume that under the conditions specified for any particular 
survey, failure to obtain an interview may arise from a multitude of
causes, which are manifest as not at home and refusal. We assume that 
people that refuse are of two kinds, those that give permanent refusals 
and those that give temporary refusals. People that give permanent 
refusals will never respond to any kind of treatment (they are a part 
of Class 0 defined more explicitly later). People that give temporary 
refusals are the kind that will refuse sometimes but will grant inter- 
views at other times or to other interviewers. An example of a tem- 
porary refusal is a case where the wrong interviewer called, or the right 
one called at the wrong time—woman bathing the baby, indisposed, 
family at dinner, etc. An interview might have been obtained with 
better luck in timing, or better luck in the selection of the interviewer. 

Class 0 contains the stubborn core of permanent impregnable re- 
fusals, plus the people who are never at home, gone to Florida, etc., 
or who are drunk when you do finally find them, or who turn out to 
be incapacitated otherwise and can not possibly give meaningful an- 
swers. At this moment we may note that the magnitude of this class 
varies widely, dependent on the type of information called for by the 
survey, and on the procedure of getting it. In a census, when people 
are away, or refuse, or are incapable of giving information, a good 
share of the required information can usually be obtained from neigh- 
bors, and is, although information on income must usually in such 











748 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1953 


cases be left unanswered. Thus Class 0 in a count of the number of 
inhabitants only is doubtless well below 1%, being reduced by the 
cooperation of neighbors. But in surveys whose express purpose is
income, expenditures, savings, medical history, the neighbors are un- 
able to help, and Class 0 is bigger. I assume it to be 5% in the calcula- 
tions to be presented here. 

At the other extreme is Class 8, the people who 8 times out of 8 are 
at home and answer the questions. Moving inward from the soft outer 
shell (Class 8) toward the impregnable core, we encounter layers of 
increasing density. In Classes 6, 4, 2, 1 are the temporary refusals plus 
the people who are not home all the time. In Class 6 an interviewer will 
be successful at finding the respondent at home and in getting an inter- 
view, on the average, 6 times out of 8; in Class 4, 4 times out of 8; etc.

Thus, we have not merely responding units and nonresponding units. 
Neither have we merely an overall proportion of response nor of non- 
response, but rather, response and (except for Class 8) nonresponse 
from each of several classes. We have not a mean value of some char- 
acteristic for the responses and some other value for the nonresponses; 
instead, each class possesses a mean and a variance. We are concerned
with the cumulative results from all classes. 


THE PATIENT MEAN 
We define the “patient mean” as

$$a^* = \frac{\sum_{i=1}^{8} p_i a_i}{\sum_{i=1}^{8} p_i} \tag{1}$$





wherein $a_i$ is the mean value per sampling unit of some particular
characteristic (rent, number of people employed, or something else)
in Class $i$, and $p_i$ is the proportion in this class. The patient mean will
be the datum from which we reckon the biases in later calculations, 
and the unit in which we shall measure the bias and the root-mean- 
square error of any plan. It is the result of calling back patiently ad 
infinitum on all the people in Classes 1, 2, 4, 6, 8. The members of 
Class 0 will also be included in the recall because in practice we have 
no way of separating them out; but as they yield no response, they 
contribute nothing to the patient mean. 
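
As a small numerical illustration of Equation 1, the sketch below evaluates the patient mean for the two sets of class means that the paper assumes in its Table 1. The dictionary layout and the function name `patient_mean` are illustrative choices, not part of the original.

```python
# Assumed inputs, copied from Table 1 of the paper (Classes 1, 2, 4, 6, 8).
# Class 0 never responds, so it does not enter the patient mean.
p = {1: 0.10, 2: 0.10, 4: 0.20, 6: 0.25, 8: 0.30}
a_first = {1: 2.00, 2: 1.75, 4: 1.50, 6: 1.25, 8: 1.00}
a_second = {1: 0.10, 2: 0.20, 4: 0.40, 6: 0.60, 8: 1.00}


def patient_mean(p, a):
    """Equation 1: mean of the class means, weighted by class proportion."""
    return sum(p[i] * a[i] for i in p) / sum(p.values())


a_star_first = patient_mean(p, a_first)    # about 1.355263
a_star_second = patient_mean(p, a_second)  # about 0.589474
```

Both values agree with the patient means listed in Table 1.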










THE INITIAL SAMPLE (ATTEMPT I) 


The treatment will be simplified by the assumption that the initial 
sample is the mere drawing of n names from a list of N names (the 
frame). A more complex plan will cause no important modification in 
the conclusions with respect to the necessity for recalls, nor with 
respect to the number of recalls required for the most economical plan. 
It will not modify seriously the comparison with the Politz plan. It 
will, however, change the absolute figures on cost, but these are not the 
aim of this study; they are auxiliary only. By further assumption the
frame will be so large compared with the sample that the multinomial
term

$$\frac{n!}{n_0!\,n_1!\,n_2! \cdots n_8!}\; p_0^{n_0} p_1^{n_1} p_2^{n_2} \cdots p_8^{n_8} \tag{2}$$

gives the probability that in the initial sample (Attempt I), there will
be $n_i$ names in Class $i$. $n$ is the size of the initial sample. $n_i$ is a random
variable; $p_i$ and $n$ are constants, satisfying the equations

$$\sum_{i=0}^{8} n_i = n \tag{3}$$

$$\sum_{i=0}^{8} p_i = 1. \tag{4}$$

If the sample ($n$) is as great as 10 per cent or more of the frame,
the variances and the biases to be computed should be reduced
approximately by the factor $1 - n/N$; in practice this reduction will be of
negligible importance.

When the returns from the initial call come in, we form from them
the numerical average for some particular characteristic and denote
it by $x(\mathrm{I})$. According to the particular mechanism postulated, the
composition of $x(\mathrm{I})$ will be the fraction

$$x(\mathrm{I}) = \frac{\text{Sum of all the numerical values in the responses of Attempt I}}{\text{Number of responses in Attempt I}} \tag{5}$$

If we were able to separate the returns by class, this would appear as

$$x(\mathrm{I}) = \frac{\sum R_i x_i}{\sum R_i} \tag{5a}$$

[Here and hereafter, sums will run over all classes except 0, unless indicated otherwise.]


wherein $R_i$ represents the number of responses from Class $i$, and $x_i$
represents the mean of the $R_i$ responses. Both $R_i$ and $x_i$ are random
variables. Their expected values are

$$E\,x_i = a_i \tag{6}$$

$$E\,R_i = n \pi_i p_i \tag{7}$$

where

$$\pi_i = \frac{i}{8}. \tag{8}$$

The variance of $x_i$ will be

$$\operatorname{Var} x_i = \frac{\sigma_i^2}{n \pi_i p_i}\left(1 + \frac{1 - \pi_i p_i}{n \pi_i p_i}\right), \tag{9}$$

wherein $\sigma_i$ is the standard deviation of the particular characteristic
in Class $i$. In what follows we shall drop terms in $1/n^2$; hence we shall
have no further use for the term $(1 - \pi_i p_i)/n \pi_i p_i$ in the last equation.

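The quantity $x(\mathrm{I})$ in Equation 5 is a random variable. Under the
assumed probability its expected value will be

$$E(\mathrm{I}) = \frac{G}{H} \tag{10}$$

and its variance will be

$$\operatorname{Var}(\mathrm{I}) = \frac{1}{nH^2} \sum \pi_i p_i \left[\sigma_i^2 + \{a_i - E(\mathrm{I})\}^2\right], \tag{11}$$

where for convenience

$$G = \sum \pi_i p_i a_i \tag{12}$$

$$H = \sum \pi_i p_i. \tag{13}$$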


The derivation of Equations 10 and 11 is simple in the light of
certain well-known principles of sampling. Let each sampling unit
possess 8 cells, each one NR or R (NR for no response, R for response)
according to the following distribution:


Class 0,  8 NR,  0 R
Class 1,  7 NR,  1 R
Class 2,  6 NR,  2 R
Class 4,  4 NR,  4 R
Class 6,  2 NR,  6 R
Class 8,  0 NR,  8 R










Now when we draw a sample, we in effect draw first a sampling unit,
which will belong to one of the above classes. Next, we draw 1 of its
8 cells at random to determine whether we get a response. If we draw
an R-cell (a response), we write down the number $x_{ij}$, which will be a
random variable, the same for all the R-cells of an individual, but varying
from one individual to another. If we draw an NR-cell (no response),
we make no record at all. The probability of getting a response in the
double drawing (first, an individual sampling unit; second, a cell) is
$\pi_i p_i$, which is only the expected proportion of all the responses that
will fall in Class $i$.
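
The double drawing can be imitated by a short simulation. The sketch below is only an illustration under stated assumptions: the class proportions and the 1st set of class means are taken from Table 1 of the paper, while the normal distribution for the recorded value (mean $a_i$, standard deviation $a_i$) and the fixed seed are hypothetical choices made here so that the example runs.

```python
import random

# Assumed inputs from Table 1; Class 0 (proportion .05) never responds.
P = {0: 0.05, 1: 0.10, 2: 0.10, 4: 0.20, 6: 0.25, 8: 0.30}
A = {1: 2.00, 2: 1.75, 4: 1.50, 6: 1.25, 8: 1.00}  # 1st set of a_i


def double_drawing(n, rng):
    """Draw n sampling units, then one of the 8 cells of each unit.

    Returns the list of recorded values (one per R-cell drawn).
    """
    classes = list(P)
    weights = [P[c] for c in classes]
    recorded = []
    for c in rng.choices(classes, weights=weights, k=n):
        # Class c holds c R-cells out of 8, so a response has probability c/8.
        if rng.random() < c / 8:
            # Hypothetical value model: normal, mean a_i, s.d. a_i.
            recorded.append(rng.gauss(A[c], A[c]))
        # An NR-cell leaves no record at all.
    return recorded


rng = random.Random(1953)
n_draws = 200_000
values = double_drawing(n_draws, rng)
response_rate = len(values) / n_draws           # close to sum(pi_i * p_i) = 0.625
mean_of_responses = sum(values) / len(values)   # close to 1.205
```

With these inputs the response rate settles near 0.625 and the mean of the recorded values near 1.205, the mean of the whole set of responses in the frame.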

The mean of the entire set of responses in the frame will be

$$a_R = \frac{\sum \pi_i p_i a_i}{\sum \pi_i p_i} = \frac{G}{H} \tag{14}$$

and their variance will be⁷

$$\sigma^2 = \frac{\sum \pi_i p_i \left[\sigma_i^2 + (a_i - a_R)^2\right]}{\sum \pi_i p_i}. \tag{15}$$


The double drawing is a random procedure in which each cell has 
the same probability as any other in the entire frame. The mean of 
the returns of a sample will therefore give an unbiased estimate of 
the mean of the entire set of responses; but this is only a restatement 
of Equation 10. The expected number of responses in a sample of n is 
n >_ mpi, wherefore the variance of a sample of n will be very closely 
equal to o?/n > mp; but this is only a restatement of Equation 11. 
And thus Equations 10 and 11 are established.*® 

The bias in the expected result H(I) of Attempt I will be defined as 


B(I) = E(I) -— a*. (16) 
The mean square error of x(I) will then be 
Mse (I) = Var (I) + B(I). (17) 


If Figure 1 were drawn for Attempt I, the two terms on the right of this 
equation would be the squares of the two legs of the triangle, and the 
left-hand member would be the square of the hypotenuse. 
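
The quantities for Attempt I can be evaluated directly. The following sketch assumes the numerical values of Table 1 of the paper (1st set of class means, with the standard deviation in each class equal to its mean) and expresses the results in units of the patient mean; the variable and function names are illustrative only.

```python
# Assumed inputs from Table 1 (1st set of a_i); sigma_i is taken equal to a_i.
p = {1: 0.10, 2: 0.10, 4: 0.20, 6: 0.25, 8: 0.30}
a = {1: 2.00, 2: 1.75, 4: 1.50, 6: 1.25, 8: 1.00}
sigma = dict(a)
pi = {i: i / 8 for i in p}                      # Equation 8

a_star = sum(p[i] * a[i] for i in p) / sum(p.values())  # Equation 1
G = sum(pi[i] * p[i] * a[i] for i in p)         # Equation 12
H = sum(pi[i] * p[i] for i in p)                # Equation 13
E_I = G / H                                     # Equation 10


def var_I(n):
    """Equation 11, i.e. sigma^2 / (n H) with sigma^2 from Equation 15."""
    s = sum(pi[i] * p[i] * (sigma[i] ** 2 + (a[i] - E_I) ** 2) for i in p)
    return s / (n * H ** 2)


def rel_mse_I(n):
    """Equation 17 in units of a*: (variance + bias squared) / a*^2."""
    bias = E_I - a_star                         # Equation 16
    return (var_I(n) + bias ** 2) / a_star ** 2


rel_bias = (E_I - a_star) / a_star              # about -0.110874
```

Under these assumptions `rel_bias`, `rel_mse_I(100)`, and `rel_mse_I(1000)` come out near -0.110874, 0.025974, and 0.013661, the entries for plan I in Table 5.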





⁷ This is the formula for the variance of a composite universe; see, for example, the author’s Some
Theory of Sampling (John Wiley & Sons, 1950), pp. 58 and 59.

8 My colleague Dr. Benjamin J. Tepping discovered this simple way of deriving Equations 10 
and 11. He furnished also algebraic proofs, but they seem not to be required. 







ATTEMPTS II, III, IV


The nonresponses left over from the first attempt form a new frame. 
The sampling plan may prescribe 0, 1, 2, or more recalls on a sample of 
these nonresponses. 

The 1st recall will be identified here as Attempt II. The 2d and 3d
recalls will be Attempts III and IV. 

The determination of the optimum fraction (y) of the nonresponses 
of Attempt I to draw for recall will be a subject for investigation in a 
later paragraph. 

The bias of nonresponse arises from Classes 1, 2, 4, 6. Each successive 
attempt digs deeper into the lower classes, and diminishes the relative 
proportions that remain in the upper classes. Class 8 is in fact wiped 
out in Attempt I. In this way the combination of successive attempts 
pushes the accumulated result closer and closer to the patient mean $a^*$.

We assume that each attempt picks up a random sample of the non- 
responses in each class. This is not what happens, but it is probably 
impossible to put down an equation for what actually happens. The 
interviewers use ingenuity. They find out from neighbors when the 
people now absent will be at home. They make observations: they make 
appointments. They hold conferences to decide which one of them 
might best succeed in breaking down a refusal. Working for and working 
against the interviewers is some softening and also some hardening 
of the hearts of people who refused at an earlier call. I have seen them 
both. The net result is probably that the recalls are less costly (as 
Houseman says) than I assume in Table 3, and more successful than 
this theory indicates. If so, then the recommendations for recalls are 
even stronger than one may conclude from this theory alone. 

Equations 10 and 11 apply also for the results of Attempts II,
III, IV, if $n$ is treated in any attempt as the number of interviews
attempted, and if $p_i$ in Equations 10-13 is replaced by:

$$\frac{(1-\pi_i)\,p_i}{\sum (1-\pi_i)\,p_i} \qquad \text{Attempt II}$$

$$\frac{(1-\pi_i)^2\,p_i}{\sum (1-\pi_i)^2\,p_i} \qquad \text{Attempt III}$$

$$\frac{(1-\pi_i)^3\,p_i}{\sum (1-\pi_i)^3\,p_i} \qquad \text{Attempt IV}$$

Class 8 contributes nothing to these sums, being wiped out by the factor
$1 - \pi_i$, which is 0 when $i = 8$.


EQUATIONS FOR THE COMBINATION OF ATTEMPTS 


If the plan of survey calls for two recalls, we combine the results of 
Attempts I, II, III. With an obvious extension of notation, the result
of this combination will be

$$x(\mathrm{I+II+III}) = u_{\mathrm{I}}\,x(\mathrm{I}) + u_{\mathrm{II}}\,x(\mathrm{II}) + u_{\mathrm{III}}\,x(\mathrm{III}), \tag{18}$$

where $u_{\mathrm{I}}$, $u_{\mathrm{II}}$, $u_{\mathrm{III}}$ are weights. If $R_{\mathrm{I}}$, $R_{\mathrm{II}}$, $R_{\mathrm{III}}$ are the responses in the
three separate attempts, then

$$u_{\mathrm{I}},\ u_{\mathrm{II}},\ u_{\mathrm{III}} = \frac{R_{\mathrm{I}},\ R_{\mathrm{II}},\ R_{\mathrm{III}}}{R_{\mathrm{I}} + R_{\mathrm{II}} + R_{\mathrm{III}}}. \tag{19}$$

For the expected value of $x(\mathrm{I+II+III})$ we may write with sufficient
approximation

$$E(\mathrm{I+II+III}) = w_{\mathrm{I}}\,E(\mathrm{I}) + w_{\mathrm{II}}\,E(\mathrm{II}) + w_{\mathrm{III}}\,E(\mathrm{III}), \tag{20}$$

wherein $w_{\mathrm{I}}$, $w_{\mathrm{II}}$, $w_{\mathrm{III}}$ are the expectations of $u_{\mathrm{I}}$, $u_{\mathrm{II}}$, $u_{\mathrm{III}}$. Formally, with
sufficient approximation,

$$w_{\mathrm{I}},\ w_{\mathrm{II}},\ w_{\mathrm{III}} = \frac{\sum \pi_i p_i,\ \ \sum \pi_i p_i (1-\pi_i),\ \ \sum \pi_i p_i (1-\pi_i)^2}{\sum \pi_i p_i \left[1 + (1-\pi_i) + (1-\pi_i)^2\right]}. \tag{21}$$

Before proceeding, we note that

$$u_{\mathrm{I}} + u_{\mathrm{II}} + u_{\mathrm{III}} = 1, \qquad w_{\mathrm{I}} + w_{\mathrm{II}} + w_{\mathrm{III}} = 1. \tag{22}$$

The bias of $x(\mathrm{I+II+III})$, the combined result of Attempts I, II,
III, will be defined as

$$B(\mathrm{I+II+III}) = E(\mathrm{I+II+III}) - a^*. \tag{23}$$

The variance of $x(\mathrm{I+II+III})$ may be computed as

$$\operatorname{Var}(\mathrm{I+II+III}) = w_{\mathrm{I}}^2 \operatorname{Var}(\mathrm{I}) + w_{\mathrm{II}}^2 \operatorname{Var}(\mathrm{II}) + w_{\mathrm{III}}^2 \operatorname{Var}(\mathrm{III}). \tag{24}$$

The notation in the above equations can easily be extended or contracted
to more or to fewer attempts. For a plan that uses only one recall,
we simply drop the symbol III; also the term $(1-\pi_i)^2$ in Equation 21.
For a plan that uses three recalls, we annex a term in IV, and
replace $(1-\pi_i)^2$ by $(1-\pi_i)^2 + (1-\pi_i)^3$.


THE POLITZ PLAN


The Politz plan includes questions to inquire of each person found
at home, and who does not refuse, to ascertain whether he was at home
last night at this time, the night before last, etc., to cover the 5 nights
preceding the interview, 6 nights in all. Each return is given a weight


$w_t$, the reciprocal of the number of nights at home over the period of 6
successive nights. The result of applying the Politz plan will be the
random variable

$$x(P) = \frac{S\, w_t R_t x_t}{S\, w_t R_t} \tag{25}$$

wherein $S$ denotes the sum over the 6 Politz classes, and wherein $R_t$
and $x_t$ denote the number of responses and their mean value in the
Politz class $t$. $w_t = 6/(1+t)$, where $t$ is the number of nights at home
during the preceding 5 nights. $w_t$, $R_t$, and $x_t$ are all random variables.

In each class except Class 8 it is possible for a person to be at home,
during the preceding 5 nights, some number of nights other than his
average ($\pi_i$). Thus, $E\,w_t$ is not the reciprocal of $\pi_i$, but takes the values
shown in Equation 29. By applying the formula

$$E\,\frac{u}{v} = \frac{E\,u}{E\,v} + \cdots \tag{26}$$


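it is possible to find the expected value of $x(P)$ and to show that the
Politz correction for not being at home leads to the bias

$$B(P) = E\,x(P) - a^* \qquad \text{[Definition]}$$

$$= a_P - a^* - \left\{\left(1 - \frac{1}{n}\right)\frac{1}{V^2}\sum (\pi_i p_i)^2 (B_i - A_i^2)(a_i - a_P) + \frac{1}{nV^2}\sum \pi_i p_i B_i (a_i - a_P)\right\}. \tag{27}$$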


The terms in the braces are very small numerically, and we accept
with sufficient approximation for our purpose,

$$B(P) = a_P - a^* \tag{28}$$

wherein $\pi_i = i/8$, as heretofore, and

$$A_i = E\,w_t = E\,\frac{6}{1+t} \qquad \text{[For Class } i\text{]}$$

$$= \sum_{t=0}^{5} \frac{6}{1+t}\binom{5}{t}\pi_i^{\,t}(1-\pi_i)^{5-t} \qquad \text{[Assuming that } t \text{ is a binomial variate]}$$

$$= \frac{1}{\pi_i}\sum_{t=0}^{5}\binom{6}{t+1}\pi_i^{\,t+1}(1-\pi_i)^{5-t}$$

$$= \frac{1}{\pi_i}\left[1 - (1-\pi_i)^6\right], \tag{29}$$

$$B_i = E\,w_t^2 = E\left(\frac{6}{1+t}\right)^2 = \sum_{t=0}^{5}\left(\frac{6}{1+t}\right)^2\binom{5}{t}\pi_i^{\,t}(1-\pi_i)^{5-t}, \tag{30}$$

$$a_P = \frac{E\,S\,w_t R_t x_t}{E\,S\,w_t R_t} = \frac{1}{V}\sum \pi_i p_i A_i a_i, \tag{31}$$

$$V = \sum \pi_i p_i A_i. \tag{32}$$
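
The closed form in Equation 29 can be checked numerically against the binomial sum from which it is derived. The sketch below does this for the five responding classes; it assumes only the binomial model stated in Equation 29, and the function names are illustrative.

```python
from math import comb


def A_binomial(pi_i):
    """E[6/(1+t)] with t ~ Binomial(5, pi_i): the first form in Equation 29."""
    return sum(6 / (1 + t) * comb(5, t) * pi_i ** t * (1 - pi_i) ** (5 - t)
               for t in range(6))


def A_closed(pi_i):
    """Closed form of Equation 29: [1 - (1 - pi_i)^6] / pi_i."""
    return (1 - (1 - pi_i) ** 6) / pi_i


def B_binomial(pi_i):
    """E[(6/(1+t))^2] under the same binomial assumption (Equation 30)."""
    return sum((6 / (1 + t)) ** 2 * comb(5, t) * pi_i ** t * (1 - pi_i) ** (5 - t)
               for t in range(6))


# One row per class i: (A_i by summation, A_i by closed form, B_i).
table = {i: (A_binomial(i / 8), A_closed(i / 8), B_binomial(i / 8))
         for i in (1, 2, 4, 6, 8)}
```

For Class 4, for example, $A_i = 1.96875$, slightly less than the reciprocal $1/\pi_i = 2$, which is the point made in the text about $E\,w_t$.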


The bias of a plan that uses $K-1$ recalls may be written

$$B(\mathrm{I} - K) = \frac{\sum p_i\left[1 - (1-\pi_i)^K\right]a_i}{\sum p_i\left[1 - (1-\pi_i)^K\right]} - a^* \tag{33}$$

to the same approximation that appears in Equation 28. With $K = 3$,
for example, this form gives a numerical verification of the bias
$B(\mathrm{I+II+III})$ calculated otherwise by Equations 20 and 23.

The variance of the Politz plan is⁹

$$\operatorname{Var}(P) = \frac{1}{nV^2}\sum \pi_i p_i B_i\left[\sigma_i^2 + (a_i - a_P)^2\right] + \left(1 - \frac{1}{n}\right)\frac{1}{V^2}\sum (\pi_i p_i)^2 (B_i - A_i^2)(a_i - a_P)^2. \tag{34}$$

It is worth noting that if we place $A_i = B_i = 1$, the second term vanishes,
and the right-hand member reduces precisely to Equation 11, as it
should.
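
Returning to the bias of a plan with recalls: the closed form of Equation 33 and the weighted combination of Equations 20, 21, and 23 can be compared numerically. The sketch below assumes the values of Table 1 of the paper (1st set of class means); the function names are hypothetical.

```python
# Assumed inputs from Table 1 (1st set of a_i).
p = {1: 0.10, 2: 0.10, 4: 0.20, 6: 0.25, 8: 0.30}
a = {1: 2.00, 2: 1.75, 4: 1.50, 6: 1.25, 8: 1.00}
a_star = sum(p[i] * a[i] for i in p) / sum(p.values())


def bias_closed(K):
    """Equation 33: bias of Attempts I through K (that is, K - 1 recalls)."""
    w = {i: p[i] * (1 - (1 - i / 8) ** K) for i in p}
    return sum(w[i] * a[i] for i in p) / sum(w.values()) - a_star


def bias_weighted(K):
    """Equations 20, 21, 23: combine the expected means of the K attempts."""
    num = den = 0.0
    for k in range(K):  # k = 0 is Attempt I
        # Expected responses of class i in this attempt, per unit sample.
        resp = {i: (i / 8) * p[i] * (1 - i / 8) ** k for i in p}
        num += sum(resp[i] * a[i] for i in p)
        den += sum(resp.values())
    return num / den - a_star


rel = {K: bias_closed(K) / a_star for K in range(1, 8)}
```

The two functions agree for every K, because the sum of $\pi_i(1-\pi_i)^k$ over the first K attempts equals $1 - (1-\pi_i)^K$. With these inputs `rel[1]` and `rel[2]` are about -0.110874 and -0.075752, the relative biases of plans I and I+II in Table 5.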





⁹ My equation for the Politz plan differs from the equations given by Politz and Simmons.













SEARCH FOR THE OPTIMUM PLAN 


The accumulated mean square error of Attempts I, II, and III will
be

$$M(\mathrm{I+II+III}) = \operatorname{Var}(\mathrm{I+II+III}) + B^2(\mathrm{I+II+III}). \tag{35}$$


We drop the symbol III for a plan that calls for two attempts, and we 
annex IV for a plan that calls for four attempts. 

Any two plans may be expected to incur different costs and to yield 
different mean square errors. As agreed at the beginning, a plan is 
optimum if its cost is less than that of any other plan that will yield 
the same mean square error. This is a matter of numerical calculation. 

Numerical assignments to the various fundamental magnitudes ($p_i$,
$a_i$, $\sigma_i$) will occur two sections ahead.

We have one other task—the determination of the optimum frac- 
tion y, a subject for the next section. 


DETERMINATION OF THE OPTIMUM FRACTION OF NONRESPONSES 
TO INCLUDE IN THE RECALLS 


Let y denote the fraction sought. We remind the reader that At- 
tempt III will be a canvass of all the nonresponses that remain from 
Attempt II, and that Attempt IV will be a canvass of all the non- 
responses that remain from Attempt III. There is thus only the one 
fraction y to determine. 

The mean square error ($M$) of the accumulated result of any number
of attempts may be written in terms of $n$ and $y$ as

$$M = A + B/n + C/ny, \tag{36}$$

the cost of which is

$$Y = Dn + Eny. \tag{37}$$

$A$, $B$, $C$, $D$, $E$ are constants. As before, $n$ is the initial sample for
Attempt I. By differentiation it can be shown that, for a fixed value of $Y$,
the minimum in $M$ occurs when

$$y^2 = \frac{CD}{BE}. \tag{38}$$

This result is independent of $n$, hence it holds for any initial size of
sample.

The equation for $y^2$ just given contains $D$ and $E$ only in the ratio
$D:E$, which shows that $y$ does not depend directly on the absolute
magnitudes to be assumed for the costs in Attempt I and later, but










rather on the ratio of these costs. And as $y$ will be proportional to
$\sqrt{D:E}$, $y$ is relatively insensitive to the ratio assumed for $D:E$. Moreover,
$y$ is not dependent on the absolute magnitudes of the $a_i$, but on
their ratio to any one of them, or to $a^*$, because $B$ and $C$ occur only in
the ratio $B:C$.
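
The optimum of Equation 38 can be confirmed by a direct search. In the sketch below the constants A, B, C, D, E and the budget Y are hypothetical numbers chosen only to exercise the formulas; they are not the values underlying the paper's tables.

```python
from math import sqrt


def optimum_y(B, C, D, E):
    """Equation 38: y^2 = C D / (B E)."""
    return sqrt(C * D / (B * E))


def mse_at_fixed_cost(y, A, B, C, D, E, Y):
    """Equation 36, with n eliminated through the cost Equation 37."""
    n = Y / (D + E * y)
    return A + B / n + C / (n * y)


# Hypothetical constants, for illustration only.
A, B, C, D, E, Y = 0.003, 1.0, 0.6, 3.0, 4.0, 3000.0

y_opt = optimum_y(B, C, D, E)
# Brute-force search over a grid of fractions from 0.050 to 1.000.
grid = [k / 1000 for k in range(50, 1001)]
y_grid = min(grid, key=lambda y: mse_at_fixed_cost(y, A, B, C, D, E, Y))
```

The grid minimum falls next to `y_opt`, and repeating the search with a different budget `Y` leaves it unchanged, in line with the remark that the result is independent of $n$.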

Table 4 shows the optimum values of y obtained from Equation 38; 
also the values selected for actual use in the calculations. The fraction 
y obviously varies slowly with the number of recalls. To simplify the 
required calculations I have set $y = 3/5$ for all plans with the first set
of $a_i$; and $y = 1/4$ for all plans with the second set of $a_i$.

It may be of interest to note that the removal of the bias of non- 
response by recalls is independent of the fraction y. It is not necessary 
to recall on the optimum fraction, nor on any other particular frac- 
tion, so far as the bias of nonresponse is concerned. However, as $y$ decreases,
the cost goes down but the variance and the r-m-s error increase,
so it is wise not to make $y$ too small. The optimum fraction, if
it can be predicted on experience, or some approximation thereto, will 
guide one close to the minimum r-m-s error for any permissible cost of 
interviewing. 

NUMERICAL MAGNITUDES ASSUMED 


In order to make numerical calculations and to derive conclusions 
therefrom with respect to the most economical design of surveys, it is 
necessary to assume some numerical magnitudes for the $p_i$, $\sigma_i$; also for
the costs. Unfortunately, no set of numerical magnitudes can be typical
of all conditions met in the field. I may interject the reminder that
every question on a questionnaire has not only its own particular values
of $a_i$ and of $\sigma_i$, but of $p_i$ as well, even within the same survey,
because some questions receive better cooperation than others. The best
that one can do is to make numerical assumptions that fit some of the 
conditions met in practice, and to infer from the equations the range 
of validity of the conclusions. 

The basic numerical assumptions are in Table 1. The expected num- 
ber of interviews, of responses, and of nonresponses, are shown in Table 2.
The response rates (the $p_i$) assumed here are intended to assimilate
average urban experience on a question of moderate difficulty; and
without making them responsible for the final choice, I wish to thank 
Messrs. Lester R. Frankel and Robert Weller of the Alfred Politz Re- 
search organization for their help and interest in choosing these par- 
ticular values. 

Fortunately, there is a great deal more generality in the two sets of 
$a_i$ than may be apparent at first sight, for one may transform either one








TABLE 1


NUMERICAL VALUES ASSUMED


                                            Class
Property and symbol            0      1      2      4      6      8      1-8

Proportion, p_i               .05    .10    .10    .20    .25    .30     .95

Mean value of the measured
characteristic
    a_i (1st set)             xxx   2.00   1.75   1.50   1.25   1.00    a* = 1.355 263
    a_i (2d set)              xxx    .10    .20    .40    .60   1.00    a* = 0.589 474

Standard deviation, σ_i       xxx   Same as a_i in both sets of a_i





TABLE 2


THE EXPECTED SIZES OF SAMPLE IN THE VARIOUS ATTEMPTS,
BASED ON AN INITIAL SAMPLE OF n IN ATTEMPT I.
HERE THE SUMS RUN OVER ALL CLASSES, 0 TO 8


Attempt   Interviews                       Responses                     Nonresponses

I         n                                n Σ π_i p_i                   n Σ (1 - π_i) p_i
II        n_II  = ny Σ (1 - π_i) p_i       ny Σ (1 - π_i) π_i p_i        ny Σ (1 - π_i)^2 p_i
III       n_III = ny Σ (1 - π_i)^2 p_i     ny Σ (1 - π_i)^2 π_i p_i      ny Σ (1 - π_i)^3 p_i
IV        n_IV  = ny Σ (1 - π_i)^3 p_i     ny Σ (1 - π_i)^3 π_i p_i      ny Σ (1 - π_i)^4 p_i
V         n_V   = ny Σ (1 - π_i)^4 p_i     ny Σ (1 - π_i)^4 π_i p_i      ny Σ (1 - π_i)^5 p_i
VI        n_VI  = ny Σ (1 - π_i)^5 p_i     ny Σ (1 - π_i)^5 π_i p_i      ny Σ (1 - π_i)^6 p_i
VII       n_VII = ny Σ (1 - π_i)^6 p_i     ny Σ (1 - π_i)^6 π_i p_i      ny Σ (1 - π_i)^7 p_i


Numerical values based on an initial sample of n = 1000

Attempt   Interviews          Responses    Nonresponses

I         n     = 1000        625.0        375.0
II        n_II  = 375.0y      126.6y       248.4y
III       n_III = 248.4y       60.3y       188.1y
IV        n_IV  = 188.1y       34.4y       153.7y
V         n_V   = 153.7y       22.2y       131.5y
VI        n_VI  = 131.5y       15.6y       115.9y
VII       n_VII = 115.9y       11.7y       104.2y










of these sets into almost any other that he may encounter. For example,
to discuss a yes-and-no survey in which the proportions of yes
vary from 60% in Class 1 to 40% in Class 8, one has only to derive a
new value $a_i'$ from an old $a_i$ by setting

$$a_i' = 20 + 20a_i \tag{39}$$

where $a_i$ on the right belongs to the 1st set of $a_i$ in Table 1. Both $a_i$
and $a_i' - 20$ have a 2-fold variation from Class 1 to Class 8. The new
patient mean is

$$a'^* = 20 + 20a^* \tag{40}$$

where $a^* = 1.355\,263$, the patient mean of the 1st set of $a_i$, as given in
Table 1. The relative bias computed for $a_i' - 20$, for any number of attempts,
will be precisely the same as the relative bias computed for $a_i$
(Table 5). It follows that the new expected value for any number of
attempts will be

$$E' = 20a^*\,\mathrm{Rel}\,B + a'^* = 47.105 + 27.105\,\mathrm{Rel}\,B \tag{41}$$


where Rel B is the relative bias shown in Table 5 for the corresponding 
number of attempts. An example will occur later (Table 9). 
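
A short calculation illustrates the transformation of Equations 39 to 41. The sketch assumes the patient mean of the 1st set from Table 1 and the relative biases of plans I and I+II as printed in Table 5; the function name is hypothetical.

```python
A_STAR = 1.2875 / 0.95  # patient mean of the 1st set of a_i, about 1.355263


def transformed_expectation(rel_bias):
    """Equation 41: expected percentage of yes, given Rel B for the old a_i."""
    new_patient_mean = 20 + 20 * A_STAR          # Equation 40, about 47.105
    return new_patient_mean + 20 * A_STAR * rel_bias


no_recalls = transformed_expectation(-0.110874)  # Rel B for Attempt I (Table 5)
one_recall = transformed_expectation(-0.075752)  # Rel B for Attempts I+II (Table 5)
```

With no recalls the expected percentage of yes is about 44.1, against a patient mean of about 47.1; one recall raises it to roughly 45.1.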

The 2d set of a; could serve the same purpose by a suitable transfor- 
mation, but we shall not carry it through. 

Thus, in spite of the limitations of any particular set of numerical 
assumptions, the conclusions to be drawn will warrant some sweeping 
generalizations. 

COSTS 


For the costs of making calls (interviewing only) we assume for cal- 
culation the following figures: 


For Attempt I, $3 per call
For later attempts, $5 per call
For the Politz plan, $4 per name (this amount will cover the cost of weighting
and of calling back on the temporary refusals)


Table 3 shows the costs of interviewing derived from the values assumed
for the $p_i$ in Table 1, and with the cost per call as mentioned
earlier. $n$ is the size of the initial sample, and $y$ is the fraction of the
nonresponses left over from Attempt I that constitute the sample for
Attempt II.





TABLE 3

COSTS OF INTERVIEWING


Plan                               No. of recalls    Cost (dollars)

Attempt I                          0                 Y = 3n
Attempts I+II                      .3750ny           3n + 1.8750ny
Attempts I-III                     .6234ny           3n + 3.1172ny
Attempts I-IV                      .8115ny           3n + 4.0576ny
Attempts I-V                       .9653ny           3n + 4.8263ny
Attempts I-VI                      1.0968ny          3n + 5.4839ny
Attempts I-VII                     1.2126ny          3n + 6.0632ny
Politz (equivalent to 5 recalls)                     4n





The actual numerical magnitudes of these costs are not so important
as their relative magnitudes. If all the costs were doubled, the cost
computed for any plan would be doubled, but the relative costs and the
relative merits of the various plans would remain unchanged.
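
The cost expressions of Table 3 follow from the class proportions and the unit costs stated above. The sketch below is an illustration under those assumptions (proportions from Table 1 with Class 0 included, $3 per first call, $5 per recall, $4 per name for the Politz plan); the function names are hypothetical.

```python
# Assumed class proportions from Table 1 (Class 0 included) and unit costs
# as stated in the text.
P = {0: 0.05, 1: 0.10, 2: 0.10, 4: 0.20, 6: 0.25, 8: 0.30}
COST_FIRST, COST_RECALL, COST_POLITZ = 3.0, 5.0, 4.0


def recall_calls_per_ny(recalls):
    """Expected number of recall interviews, per unit of n * y."""
    return sum(sum((1 - i / 8) ** k * p for i, p in P.items())
               for k in range(1, recalls + 1))


def interviewing_cost(n, y, recalls):
    """Cost in dollars of Attempt I plus the stated number of recalls."""
    return COST_FIRST * n + COST_RECALL * recall_calls_per_ny(recalls) * n * y


def politz_cost(n):
    """Cost in dollars of the Politz plan for n names."""
    return COST_POLITZ * n
```

For example, `interviewing_cost(1000, 0.25, 1)` is 3468.75 and `interviewing_cost(1000, 0.25, 6)` is about 4516, matching the costs for plans I+II and I-VII at n = 1000 and y = .25 in Table 6.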


TABLE 4


RESULTS FOR THE OPTIMUM y, AND THE VALUES
SELECTED FOR THE CALCULATIONS THAT LED
TO TABLES 5 AND 6, AND TO FIGS. 2 AND 3


                  1st set of a_i                       2d set of a_i

Plan       y calculated      y selected         y calculated      y selected
           from              for                from              for
           Equation 38       calculation        Equation 38       calculation

I-II          .69               …                  …                 …
I-III         .67               …                  …                 …
I-IV          .65               …                  …                 …
I-V           .63               …                  …                 …
I-VI          .61               …                  …                 …
I-VII         .60               …                  …                 …



33 








It should be noted that these costs are for the interviewing only. 
Considerations of overhead costs, training, and office-work for the 
different plans must be taken into account before one decides definitely 
whether one plan is more economical than another. 


CONCLUSIONS FROM THE CALCULATIONS 


The numerical results of the calculations are in Tables 5, 6, 7, 8 and
in Figs. 2 and 3. The biases and r-m-s errors are expressed in units of





TABLE 5


NUMERICAL VALUES OF THE BIASES AND R-M-S ERRORS FOR
VARIOUS SIZES OF INITIAL SAMPLE (n); 1st SET OF a_i,
y = .6. COSTS AT n = 1000


Plan              I          I+II       I-III      I-IV       I-V        I-VI       I-VII

Rel bias       -.110874   -.075752   -.057302      …       -.036800      …          …

n=100
Rel m-s-e       .025974    .019918    .017628      …        .015938      …          …
Rel r-m-s-e     .161164    .141131    .132773      …        .126246      …          …

n=200
Rel m-s-e       .019133    .012828    .010456      …        .008646      …          …
Rel r-m-s-e     .138322    .113261    .102255      …        .092984      …          …

n=300
Rel m-s-e       .016853    .010464    .008065      …        .006215      …          …
Rel r-m-s-e     .129822    .102294    .089805      …        .078834      …          …

n=500
Rel m-s-e       .015029    .008574    .006153      …          …          …          …
Rel r-m-s-e     .122593    .092596    .078441      …          …          …          …

n=1000
Rel m-s-e       .013661    .007156    .004718      …          …          …          …
Rel r-m-s-e     .116880    .084593    .068688      …          …          …          …

n=2000
Rel m-s-e       .012977    .006447    .004001      …          …          …          …
Rel r-m-s-e     .113920    .080293      …          …          …          …          …

n=3000
Rel m-s-e       .012749    .006211      …          …          …          …          …
Rel r-m-s-e     .112911    .078810      …          …          …          …          …

n=5000
Rel m-s-e       .012567    .006022      …          …          …          …          …
Rel r-m-s-e     .112103      …          …          …          …          …          …

Costs at
n=1000          $3000        …          …          …          …          …          …

























$a^*$. The base for the bias is the 0-point of the scale for the $a_i$. The
estimation is assumed to be a summation of the initial call and the recalls.
The aim is assumed to be the estimation of an average or of a total. 


A. Conclusions from the 1st set of $a_i$, a 2-fold variation from $a_1$ to $a_8$:
Table 5 and Fig. 2. Conclusions 1, 2, 3, 4, and 5b are independent of the 
type and size of sample. 


1. With no recalls at all (Attempt I only), the minimum relative 







r-m-s error attainable is 11%. No sample however big, not even a com- 
plete count, can penetrate below this minimum, without recalls. 


[Figure 2: relative error (Rmse and -B for plans I through I-VII) and cost, each plotted against size of initial sample; graphic not reproduced.]

FIGURE 2. The relative bias, the relative r-m-s error, and the cost, plotted against
the initial sample-size ($n$) for various plans, for the 1st set of $a_i$, in which $a_1 = 2a_8$.
The curves show the futility of attempting to achieve accuracy by sheer size of
sample. Recalls are much more effective. The dashed lines show the size of
sample required, and the cost, to yield a relative r-m-s error of 7½%. The relative
biases and the relative r-m-s errors are in units of $a^*$.


2. With one recall (Attempts I+II), the minimum r-m-s error drops 
to 7.6%. No sample however big can penetrate below this minimum 
with only one recall. 







3. With 2 recalls (Attempts I+II+III), the minimum r-m-s error 
drops to 5.7%. No sample however big can penetrate below this mini- 
mum with only two recalls. 

4. With 3 recalls (Attempts I-IV), the minimum r-m-s error drops 
to 4.5%. With 4, 5, and 6 recalls, the minimum r-m-s error drops to 3.7, 
3.0, and 2.5%. 

5. To attain a prescribed r-m-s error of (e.g.) 7½%:


(a) We may use 3, 4, 5, or 6 recalls with initial samples as shown in the
accompanying table.


From Fig. 2

No. of recalls    Initial sample    Cost

      6                345          $2,290
      5                378           2,390
      4                408           2,450
      3                512           2,800


(b) With 0, 1, or 2 recalls we can not attain the prescribed r-m-s error (7½%)
with any sample however big.
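
The minima quoted in conclusions 1 to 4 are the limits of the relative r-m-s error as the sample grows, that is, the absolute relative biases. The sketch below recomputes them from the closed form of the bias (Equation 33), assuming the 1st set of values in Table 1; the function names are hypothetical, and the figures it produces are approximations to those printed in the paper.

```python
# Assumed inputs from Table 1 (1st set of a_i).
p = {1: 0.10, 2: 0.10, 4: 0.20, 6: 0.25, 8: 0.30}
a = {1: 2.00, 2: 1.75, 4: 1.50, 6: 1.25, 8: 1.00}
a_star = sum(p[i] * a[i] for i in p) / sum(p.values())


def error_floor(recalls):
    """Limit of the relative r-m-s error as n grows: |relative bias|."""
    K = recalls + 1  # number of attempts
    w = {i: p[i] * (1 - (1 - i / 8) ** K) for i in p}
    expected = sum(w[i] * a[i] for i in p) / sum(w.values())
    return abs(expected - a_star) / a_star


def attainable(target, recalls):
    """True if the floor for this number of recalls lies below the target."""
    return error_floor(recalls) < target


# Floors in per cent for 0 through 6 recalls, rounded to one decimal.
floors = {r: round(100 * error_floor(r), 1) for r in range(7)}
```

For 0, 1, 2, and 3 recalls the computed floors are close to the 11%, 7.6%, 5.7%, and 4.5% quoted above, and they keep falling as further recalls are added.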


B. Conclusions from the 2d set of $a_i$, a 10-fold variation from $a_1$ to $a_8$:
Table 6 and Fig. 3. Conclusions 6, 7, 8, 9, and 10b are independent of the 
type and size of sample. 


6. With no recalls at all (Attempt I only), the minimum r-m-s error
attainable is 24.5%. No sample however big, not even a complete
count, can penetrate below this minimum without recalls.

7. With one recall (Attempts I+II), the minimum r-m-s error drops
to 15.5%. No sample however big can penetrate below this minimum
with only one recall.

8. With 2 recalls (Attempts I+II+III), the minimum r-m-s error
drops to 11.3%. No sample however big can penetrate below this mini- 
mum with only two recalls. 

9. With 3 recalls (Attempts I-IV), the minimum r-m-s error drops 
to 8.7%. With 4, 5, and 6 recalls, the minimum r-m-s error drops to 
6.9, 5.6, and 4.7%. 

10. To attain a prescribed r-m-s error of (e.g.) 10%: 


(a) We may use 3, 4, 5, or 6 recalls with initial samples as shown in the ac- 
companying table. 











From Fig. 3

No. of recalls    Initial sample    Cost

      6                210          $  875
      5                245           1,010
      4                325           1,380
      3                730           2,910





(b) With 0, 1, or 2 recalls we can not achieve the prescribed r-m-s error (10%) 
with any sample however big. 


TABLE 6


NUMERICAL VALUES OF THE BIASES AND R-M-S ERRORS FOR
VARIOUS SIZES OF INITIAL SAMPLE (n); 2d SET OF a_i,
y = .25. COSTS AT n = 1000


Plan              I          I+II       I-III      I-IV       I-V        I-VI       I-VII

Rel bias        .245190    .155062    .112665    .086955    .069408    .056593    .046815

n=100
Rel m-s-e       .091985    .046455    .032012    .025385    .021763    .019563    .018135
Rel r-m-s-e     .303290    .215534    .178919    .159327    .147523    .139868    .134666

n=200
Rel m-s-e       .076052    .035250    .022353    .016473    .013290    .011383    .010164
Rel r-m-s-e     .275775    .187750    .149509    .128347    .115282    .106691    .100817

n=300
Rel m-s-e       .070741    .031515    .019133    .013502    .010466    .008656    .007507
Rel r-m-s-e     .265972    .177525    .138322    .116198    .102303    .093038    .086643

n=500
Rel m-s-e       .066492    .028527    .016558    .011125    .008207    .006475    .005381
Rel r-m-s-e     .257860    .168899    .128678    .105475    .090592    .080467    .073355

n=1000
Rel m-s-e       .063306    .026286    .014626    .009343    .006512    .004839    .003787
Rel r-m-s-e     .251607    .162130    .120938    .096659    .080697    .069563    .061539

n=2000
Rel m-s-e       .061712    .025166    .013660    .008451    .005665    .004021    .002990
Rel r-m-s-e     .248419    .158638    .116876    .091929    .075266    .063411    .054681

n=3000
Rel m-s-e       .061181    .024792    .013338    .008154    .005383    .003748    .002724
Rel r-m-s-e     .247348    .157455    .115490    .090299    .073369    .061221    .052192

n=5000
Rel m-s-e       .060756    .024493    .013080    .007917    .005157    .003530    .002512
Rel r-m-s-e     .246487    .156502    .114368    .088978    .071812    .059414    .050120

Costs at
n=1000          $3000      3469       3779       4014       4207       4371       4516


























































































Figure 3. The relative bias, the relative r-m-s error, and the cost, plotted
against the initial sample-size (n) for various plans, for the 2d set of a_i, in which
a_1 = .1 a_6. The curves show the futility of attempting to achieve accuracy by
sheer size of sample. Recalls are much more effective. The dashed lines show the
size of sample required, and the cost, to yield a relative r-m-s error of 10%.
The relative biases and the relative r-m-s errors are in units of a*.


C. General conclusions 


11. Even with three recalls, with the level of response assumed in the 
calculations (taken from average urban experience), a sample bigger 
than the binomial equivalent of from 300 to 500 for an estimate of any 
one class is ineffective and uneconomical. A plan that would reap any 
real benefit from bigger samples must support 4 or 5 or more recalls. 










12. An attempted “complete count” is no exception, and often rep- 
resents an extreme waste of effort. 

13. With the proportions of nonresponse assumed here, high ac- 
curacy can be attained only with 4, 5, or 6 recalls, along with an initial 
sample equivalent to from 800 to 1500 binomial cases. Careful con- 
sideration should therefore be given in the planning to decide whether 
the need for extreme accuracy warrants the required expense and delay 
occasioned by recalls beyond the 3d, and for an initial sample bigger 
than the binomial equivalent of n =300 in any subclass of the universe 
for which an estimate is desired. 

14. Table 8 shows that where extremely high accuracy is required, 
the Politz plan with 2000 or more binomial cases becomes competitive 
in cost with a survey that depends on recalls. In any case, the Politz 
plan has the advantage of speed, and of being able to produce results 
under circumstances wherein recalls are impossible. 

15. Because one kind of experience may be translated into another
by transformations similar to Equation 39, the generality of the above
conclusions and their impact on the design and interpretation of sur-
veys and of complete counts are inescapable. A limiting case of excep-
tion occurs, of course, when the range of variation of the a_i is small
compared with a*.

16. The above conclusions with respect to the number of recalls re- 
quired are generally applicable to all types of sample-design for draw- 
ing the sampling units. A change in sample-design (as from the bi- 
nomial sampling of individuals to samples of areas) only changes (usu- 
ally widens) the distance from the bias to the r-m-s error in Figs. 2 and 
3, without raising or lowering the bias. The most economical number (n) 
of interviews in an area sample, for any given number of recalls, will 
for most characteristics be bigger than the figures mentioned in con- 
clusions 11, 13, and 14. The increase may range from 0 on up to some- 
times double, depending on the characteristic and the clustering effect 
of the interviewers’ workloads. 


IMPACT ON DESIGN 


The most impressive feature of the results is the heavy bias of non-
response, when no provision is made to reduce it, even though there
be but a 2-fold variation from a_1 to a_6.

The second most impressive feature is the fact that if nonresponse
reaches anywhere near the proportions (p_i) assumed, then when the 0
of the scale of the a_i is not large, we can not afford, except for special













justification, to plan for extreme accuracy: it is simply too expensive. 

This conclusion is also borne out by Table 7, which shows that more 
information per dollar comes from a sample of 500 than for a sample 
of 1000; and that every successive recall shows a gain in the amount of 
information obtained per dollar, particularly for the smaller sample 
size. An optimum is not reached even with six recalls. In other words, 
as we concluded earlier from Figs. 2 and 3 and from Tables 5 and 6, we 
get more for our money by taking a moderate initial sample and dig- 
ging deep into it with many recalls. However, many recalls delay the 
day on which the tabulations will be ready, and one may be forced to 


TABLE 7


THE AMOUNT OF INFORMATION PER UNIT COST FOR THE SEVEN
PLANS (FROM 0 TO 6 RECALLS), FOR INITIAL SAMPLES OF 500 AND
1000. INFORMATION IS DEFINED AS THE RECIPROCAL OF THE
REL M-S-E IN TABLES 5 AND 6. THE COST COMES FROM TABLE 3


              1st set of a_i            2d set of a_i
Plan        n = 500    n = 1000      n = 500    n = 1000

I           .044 358   .024 400      .005 013   .005 265
I+II        .056 562   .033 877      .020 216   .010 966
I-III       .066 744   .043 522      .031 954   .018 092
I-IV        .074 326   .052 524      .044 787   .026 664
I-V         .079 422   .060 315      .057 912   .036 501
I-VI        .082 438   .066 519      .070 649   .047 278
I-VII       .083 856   .071 127      .082 302   .058 472








call a halt at 3 or 4 recalls. Where speed is urgent, or where recalls are 
otherwise inadvisable, one may bear in mind the Politz plan, which of- 
fers a rapid solution with recalls only on the temporary refusals. 
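
The definition of information per unit cost used in Table 7 can be illustrated with the figures of Table 6. The sketch below takes the rel m-s-e and the costs at n = 1000 for the 2d set of a_i and divides the reciprocal of the rel m-s-e by the cost. The function name `information_per_dollar` is an illustrative choice, not notation from the text.

```python
# Rel m-s-e at n = 1000 (Table 6, 2d set of a_i) and costs at n = 1000.
PLANS = ["I", "I+II", "I-III", "I-IV", "I-V", "I-VI", "I-VII"]
REL_MSE = [.063306, .026286, .014626, .009343, .006512, .004839, .003787]
COST = [3000, 3469, 3779, 4014, 4207, 4371, 4516]


def information_per_dollar(rel_mse, cost):
    """Information (reciprocal of the rel m-s-e) obtained per dollar."""
    return (1.0 / rel_mse) / cost


info = [information_per_dollar(m, c) for m, c in zip(REL_MSE, COST)]
for plan, value in zip(PLANS, info):
    print(f"{plan:6s} {value:.6f}")
```

The values rise with every added recall and agree, to within rounding in the sixth decimal, with the last column of Table 7.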
With the usual method of estimation (pooling the initial call and the 
recalls) the best way to attain accuracy is to build up the initial re- 
sponse (i.e., to increase ps). One or two recalls would then be much 
more effective than they are under the conditions assumed; and bigger 
samples would also be more effective. Observations on the proper time 
of day to find certain kinds of people at home in a particular area, and 
willing to answer questions, plus a skillful introduction and approach 
so as to cut down refusals, are known to be helpful in this direction. 
An attempted complete count is no exception to the conclusions 
reached. Without a highly successful initial response, followed by some 
effective number of recalls, 95% of the energy put into a complete 








count, taken to obtain an estimate for a large area, may be wasted.
Size does not atone for nonresponse: this is all too evident from the
calculations (Tables 5 and 6; Figs. 2 and 3).


TABLE 8


THE RELATIVE BIAS AND THE RELATIVE R-M-S ERROR OF THE POLITZ PLAN

(Tabulated for n = 200, 500, and 1000, for the 1st and 2d sets of a_i.)

The mechanism adopted here is a device by which experience can
be accumulated and pointed toward the attainment of (a) greater ac-
curacy per unit cost, and (b) less waste, through conservation of un-
productive effort expended on samples that are too big. Good guesses
for the constants p_i, a_i, σ_i can almost always be made on the basis of
past experience; and the calculations made with them will indicate a
plan not far from optimum. Continued experience will provide im-
proved numerical values for the constants, and continually improved
design and interpretation of the results. Without a probability design
of some sort, it is difficult to capitalize on experience.

Although the discourse here has been entirely in terms of interviews, 
the results are equally applicable to surveys in which the initial at- 
tempt is made by mail, or in which all attempts are made by mail. 
Appropriate changes must of course be made in the numerical values of 
the constants. Thus, if the mail were used for Attempt I, and if inter- 
views were used for the recalls, then the cost D in Equation 38 would 
be much less than it is when interviews are used in Attempt I, and y 
will then be smaller. For example, if the cost of a mailed questionnaire 
were $.75, and if the cost of an interview on a nonresponse were $5, 
then y would reduce to perhaps as low a figure as 1 in 6, depending of 
course on the other constants in the equation. 

One may well wonder what the biases are in surveys that depend 
only on a mailed survey with a 15% total response, or even 30% or 
50%, without calls on the nonresponses. The mechanism adopted here 
shows that it is a mystery how such results can be worth anything at 
all. 


IMPACT ON METHODS OF ESTIMATION 


After the returns from the survey are in, there remains the problem 
of estimating the mean per sampling unit, and the standard error of 
this estimate. As the survey does not touch Class 0, it can by itself 
only produce estimates for Classes 1-6. 

The usual practice of combining the various attempts (after weight- 
ing Attempt ITI and higher attempts by the factor 1/y) may be both 
misleading and inefficient. A glance at Table 5 or at Figure 2 shows that 
41% of the bias still remains after the 3d recall, and that 27% still 
remains after the 5th. Table 6 and Figure 3 are equally discouraging. 
The decreasingly slow ascent toward the vertex of 0 bias may explain 













how easy it is to conclude, incorrectly, that after 3 recalls there is little 
more bias to squeeze out, and that additional recalls are not worth their 
cost. 

To illustrate the usual procedure, let us make some calculations on
a yes and no survey.¹⁰ The proportions of yes in the various classes will
range, let us suppose, from 60% in Class 1 down to 40% in Class 6,
following the relative values of 20(a_i + 1) derived from the 1st set of a_i
in Table 1. Table 9, calculated with the aid of Equation 41, shows the
expected results of combining 2 attempts, 3 attempts, etc. The result
that we really need is the patient mean, shown at the bottom of the ta-
ble as the expected result of continuing the recalls indefinitely. The
slow progress of the combined result is obvious; also the need of some-
thing better.


TABLE 9


THE EXPECTED PROPORTIONS OF YES, FOR SEVERAL PLANS,
COMPUTED BY EQUATION 41. THE PROPORTIONS OF YES
RANGE FROM 60% IN CLASS 1 TO 40% IN CLASS 6


                 Expected proportion        Bias
Plan               Yes          No        remaining

Attempt I         44.10        55.90       100.0%
I+II              45.05        54.95        68.4
I+II+III          45.55        54.45        51.8
I-IV              45.88        54.12        40.9
I-V               46.11        53.89        33.2
I-VI              46.28        53.72        27.6
I-VII             46.42        53.58        22.8
Infinity    a'* = 47.11        52.89         0
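
The last column of Table 9 can be recovered from the first. The sketch below assumes that "bias remaining" means the distance still to go to the limiting value, expressed as a percentage of the distance at Attempt I. That reading is inferred from the tabulated figures, and the name `bias_remaining` is illustrative.

```python
# Expected percentage of "yes" under each plan (Table 9).
EXPECTED_YES = {
    "I": 44.10, "I+II": 45.05, "I+II+III": 45.55, "I-IV": 45.88,
    "I-V": 46.11, "I-VI": 46.28, "I-VII": 46.42,
}
LIMIT = 47.11  # expected result of continuing the recalls indefinitely


def bias_remaining(expected, first=EXPECTED_YES["I"], limit=LIMIT):
    """Percent of the Attempt I bias still present in a combined result."""
    return 100.0 * (limit - expected) / (limit - first)


remaining = {plan: bias_remaining(y) for plan, y in EXPECTED_YES.items()}
for plan, pct in remaining.items():
    print(f"{plan:9s} {pct:5.1f}")
```

Because the inputs are rounded to two decimals, the results agree with the table only to about a tenth of a percentage point.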








What we need is a way to extract more information from the recalls. 
A more efficient estimate may be contained in a scheme for extrapolat-
ing the results of the various attempts, as proposed by Hendricks.¹¹
The mechanism proposed here will provide a rational scale for the 
extrapolation. It may be that the scale proposed by Hendricks is ap- 





10 I am indebted to Dr. Leo P. Crespi and to Mr. Fred W. Trembour of the Reactions Analysis
Staff in the Office of the High Commissioner for Germany, who in several conversations with the author
brought up questions and suggestions that led to this illustration.

11 Walter A. Hendricks, Chapter 5 in the book Agricultural Estimating and Reporting Service (Miscel-
laneous Publications No. 703, Bureau of Agricultural Economics, Washington, 1949); pages 31-35 in
particular.







propriate, or it may be that some other scale will give more accurate 
results with convenience. 

For an estimate of a* by extrapolation, we may look upon recalls as 
necessary to provide the required coordinates of points by which to 
make the extrapolation, and not merely to provide additional returns 
to add to the initial attempt. 

For this new type of estimate, the standard error would not be cal- 
culated in the usual way (Equation 24), but as the standard error 
of the intercept on the scale along which we read, by extrapolation, the 
estimate of a*. New theory will be required for the optimum allocation 
of effort amongst the various recalls, and for effecting the extrapola- 
tion; also for calculating its standard error. It may turn out, for ex- 
ample, that unless one can achieve extremely high initial response, 
approaching 90%, there may be little point in expending funds to build 
it up. It is possible that theory beyond the scope of this paper may lead 
to efficiency and reliability far beyond those attained in practice today. 


SOME REMARKS ON CLASS 0 


We must face the fact that our survey can at best only provide esti-
mates for Classes 1-6, although it can also give us the proportion p_0
and some of the characteristics of Class 0. The administrative decisions 
that the survey was expected to help may nevertheless involve Class 0 
along with the others. In a marketing study, for example, the people in 
this class may be heavy purchasers of the very commodity that forms 
the subject of the survey. They may in part be people who travel much, 
and who may thus be important to a railway, an air line, a manufac- 
turer of automobiles, a hotel, and to others. They may be people in 
high income groups. It may therefore be important to learn how much 
we are missing by not bringing them into the survey. 

Unfortunately, it is impossible to learn this magnitude from the sur- 
vey itself. The only possible approach seems to be from outside sources, 
such as through statistics on the total movement of a particular product 
from wholesale into retail stores. It is possible in many cases to gather 
outside evidence by which to evaluate approximately the magnitude 
of a_0 (the mean in Class 0), or rather of the total a_0 p_0 in Class 0, for
some of the important characteristics that affect the decisions or relate
to them. The next step is to ascribe upper and lower bounds to the pos-
sible magnitude of a_0 p_0, and thus to infer the possible effects of Class 0
on the uses and limitations of the data.¹²





12 This suggestion came from Professor Philip M. Hauser in an informal conversation in regard to
this research.













The difficulty with Class 0 is not peculiarly a sampling problem, as 
Class 0 appears in complete counts as well as in samples—in fact, it is 
undoubtedly bigger in complete counts than in samples. 


ACKNOWLEDGMENTS


This research could not have been completed without the generous 
assistance of a number of friends. In particular, my wife, Lola S.
Deming, carried out the major portion of the calculations, and took 
charge of the preparation of the final drafts of the manuscript. Mr. J. E. 
Ball of the Monroe Calculating Machine Company lent me a machine 
for a long period while my own machine was under repair. Dr. Ben- 
jamin J. Tepping found a fundamental flaw in an earlier draft, and ren- 
dered indispensable assistance in the derivations of Equations 10 and 
11. Messrs. Lester Frankel and Robert Weller of the Alfred Politz 
Research organization rendered prompt and helpful assistance on a 
number of occasions. My colleagues, Morris H. Hansen, William N. 
Hurwitz, and George Kuznets, kindly offered suggestions toward clari- 
fication. 










EFFECT OF WEIGHTING BY CARD-DUPLICATION 
ON THE EFFICIENCY OF SURVEY RESULTS 


Irving Roshwalb
Opinion Research Corporation 


Survey designs often specify that weights be applied to certain
groups of observations as part of the estimation procedures. These
estimation procedures are designed as complements of sampling in-
structions which may, for example, specify methods for eliminating
call-backs,¹ or the disproportionate allocation of the sample to the sev-
eral strata.² A third example of the need for weighting may be taken
from the fact that the denominator of the sampling fraction, 1/k, is fre-
quently a non-integral divisor of the size of the population to which the
fraction is applied. This means that the number of cases obtained is
very often not equal to the number of cases desired, and that some
weighting adjustments may thus be called for. The subject of this note
is the specific problem of the effect of non-integral weights (e.g., 1.2,
1.75, 8.4, etc.) on the sampling variability of the survey results. The
two weighting procedures described below, arithmetic and card-duplica-
tion, give identical results when the weights are integral. This identity
disappears when the weights are non-integral, for card-duplication
involves sampling the returned questionnaires for reproduction. A dis-
cussion of the problem for the case of stratified sampling, using dispro-
portionate allocation, offers a useful solution.

When a sample design calls for the disproportionate allocation of the
sample to the several strata, we have seen that there are two alternative
estimation procedures:

(a) Arithmetic weighting: If W_i is the proportion of the population in the
ith stratum and x̄_i is the sample estimate of the mean in the ith stratum,
then x̄ = Σ W_i x̄_i is an unbiased estimate of the population mean, μ (i = 1,
2, ..., L).

(b) Card-duplication weighting: If the results of the survey can be recorded
on punch-cards, then it is possible to weight the N_i observations in each
stratum to their proper weight in the sample by drawing a random sample
of n_i cards from the original N_i cards so that the total number of cards in
the stratum, (N_i + n_i), is equal to W_i(N + n), where (N + n) is the total
number of cards in all strata, including both the original and the duplicate
cards.





1 A. Politz and W. Simmons, “An attempt to get the not-at-homes into the sample without call-
backs,” Journal of the American Statistical Association, 44 (1949), 9-31.
2 W. E. Deming, Some Theory of Sampling. New York: John Wiley and Sons, 1950, p. 215. 




In the case of arithmetic weighting, the variance of x̄ is known to be

$$\operatorname{Var}\bar{x} = \sum_{i=1}^{L} W_i^2 \operatorname{Var}\bar{x}_i. \qquad (1)$$

In the case of card-duplication, the variance of the sample estimate
of the mean may contain an additional component of variation due to
the sample of n_i cards drawn from the original sample N_i in the ith
stratum. Thus, if n_i = 0, or n_i = dN_i, where d is an integral number, then
this additional component is equal to zero. If, however, n_i/N_i > 0 and
non-integral, this additional component may be greater than zero. Dis-
regarding the stratal index, we may then seek the value of n (the num-
ber of cards to duplicate) which maximizes the sampling error of an
estimate based on the N + n cards, original and duplicates, and, coin-
cidentally, examine the effect of card-duplication on the variance of the
estimate.

Assume that a random sample of N items be drawn without replace-
ment from a population of size P and with mean μ and variance σ².
Draw n cards at random without replacement from the N, duplicate
these n cards, and estimate μ on the basis of the N + n cards. This last
estimate is

$$m = \frac{1}{N+n}\left\{2\sum_{i=1}^{n} x_i + \sum_{i=n+1}^{N} x_i\right\} \qquad (2)$$

and

$$\operatorname{Var} m = \frac{\sigma^2}{(N+n)^2}\left\{4n\,\frac{P-n}{P-1} + (N-n)\,\frac{P-(N-n)}{P-1}\right\}. \qquad (3)$$

Considered as a function of n,

$$\operatorname{Var} m = \frac{\sigma^2}{N}\,\frac{P-N}{P-1}$$

for n = 0 and n = N. Var m attains its maximum value when n = N/3.
When n = N/3,

$$\operatorname{Var} m = \frac{\sigma^2}{n}\,\frac{3P-4n}{8(P-1)}.$$

In order to study the effect of weighting by card-duplication on the
variance of the sample mean, we might first compute the relative infor-
mation³ of this estimate, where the relative information, computed as
the ratio between Var x̄ and Var m, is

$$I = \frac{(1+d)^2\,(1-r)}{(3d+1) - (5d^2-2d+1)\,r} \qquad (4)$$

where d is the rate of duplication = n/N, and r is the sampling rate
= N/P.


EXAMPLES OF THE EFFECT OF CARD-DUPLICATION ON
THE VARIANCE OF THE SAMPLE MEAN


                      Rate of Duplication
Sampling
rate          20%       33⅓%      50%       66⅔%

              Relative Information* (percent)

0†           90.00     88.89     90.00     92.59
.01          89.55     88.39     89.55     92.26
.05          87.69     86.36     87.69     90.82
.10          85.26     83.72     85.26     88.93
.50          60.00     57.14     60.00     67.57
.90          16.36     14.81     16.36     21.37

* Relative Information = (Var x̄ / Var m) × 100. The Relative Loss of Information due to card-duplication
may be computed by subtracting the appropriate I value from 100%, i.e., the Relative Loss of Infor-
mation = 100% − (Var x̄ / Var m) × 100.

† A zero sampling rate corresponds to the case of sampling from an infinite population, or sampling
with replacement.


When r = 0, the condition of sampling with replacement holds, and
I = (1+d)²/(3d+1), a function of the rate of duplication only. The
table exhibits a few examples of the effect of card-duplication on the
variance of the sample mean, m, for several sampling rates. The figure,
exhibiting the graph of I as a function of d for various values of r, dem-
onstrates the slight losses in efficiency due to card-duplication when the
sampling rate is small. It also points up the shallowness of these curves,
i.e., the insensitiveness of I to changes in n within a broad interval
about the critical value, n = N/3.
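
The entries of the table follow directly from the expression for I in terms of d and r. The sketch below evaluates that expression, I = (1+d)²(1−r) / [(3d+1) − (5d²−2d+1)r], for the tabulated rates and then searches a grid of duplication rates for the least efficient one when r = 0. The function name `relative_information` and the 1% grid are illustrative choices.

```python
def relative_information(d, r):
    """Ratio Var x-bar / Var m for duplication rate d = n/N and sampling
    rate r = N/P (r = 0 corresponds to an infinite population)."""
    return (1 + d) ** 2 * (1 - r) / ((3 * d + 1) - (5 * d ** 2 - 2 * d + 1) * r)


DUPLICATION_RATES = [0.20, 1 / 3, 0.50, 2 / 3]
SAMPLING_RATES = [0.0, 0.01, 0.05, 0.10, 0.50, 0.90]

# One row per sampling rate, in percent, rounded as in the printed table.
table = {
    r: [round(100 * relative_information(d, r), 2) for d in DUPLICATION_RATES]
    for r in SAMPLING_RATES
}
for r, row in table.items():
    print(f"{r:4.2f}", "  ".join(f"{v:6.2f}" for v in row))

# With r = 0, find the duplication rate on a 1% grid with the least information.
worst_d = min((k / 100 for k in range(0, 101)),
              key=lambda d: relative_information(d, 0.0))
print("least efficient d on the grid:", worst_d)
```

The grid search lands next to d = 1/3, and neighboring values of d give nearly the same I, which is the shallowness noted above.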

When card-duplication is applied in the case of a stratified sample,

$$m = \sum_{i=1}^{L} \frac{N_i+n_i}{N+n}\, m_i \qquad (5)$$


3 R. A. Fisher, Statistical Methods for Research Workers, Tenth Edition. Edinburgh: Oliver and Boyd,
1948, Section 55. 
















$$\operatorname{Var} m = \sum_{i=1}^{L}\left(\frac{N_i+n_i}{N+n}\right)^2 \operatorname{Var} m_i
= \sum_{i=1}^{L} W_i^2\,\frac{\sigma_i^2}{(N_i+n_i)^2}\left\{4n_i\,\frac{P_i-n_i}{P_i-1} + (N_i-n_i)\,\frac{P_i-(N_i-n_i)}{P_i-1}\right\} \qquad (6)$$

where m_i is the sample estimate of the mean in the ith stratum, N_i and
n_i are as before, N = Σ N_i, n = Σ n_i, and W_i = (N_i + n_i)/(N + n). Com-
paring expressions (1) and (6), we can see that Var m − Var x̄ is 0 for
n_i = 0 or N_i.

[Figure. Effect of Weighting by Card-Duplication: Relative Information of the
Weighted Sample. Relative information plotted against rate of duplication, one
curve for each sampling rate.]

The least efficient case is the limiting one for which n_i
= N_i/3. In this case,


Ci 


Var m = 1.125 Var? + 0.5), W? 7 
t=] 


The increase in sampling variance for the stratified sample due to
card-duplication is a resultant of the strata increases. Let the symbol
for this increase be C².

In any practical operation it might be asked whether C² is larger or
smaller than the (bias)² due to incorrect strata weights that the weight-
ing procedure is designed to remove.

If w_i are the incorrect strata weights and W_i the correct ones, the
bias due to the use of incorrect weights may be expressed as

$$B_w^2 = \left\{\sum (w_i - W_i)\,\bar{x}_i\right\}^2 \qquad (7)$$

It would seem then that weighting by card-duplication is useful only
when B_w² > C². On the other hand, nothing is to be gained and some-
thing may be lost if B_w² ≤ C².

This indicates that if the method of card-duplication is used to ob- 
tain estimates from a disproportionate sample, then whatever gains 
that might have been expected from the disproportionality could be 
seriously reduced by this weighting procedure. In other words, the 
gains due to the sample design must be at least large enough to over- 
come the loss of efficiency due to the weighting scheme. 











THE MATHEMATICAL BASIS FOR THE BEAN METHOD OF
GRAPHIC MULTIPLE CORRELATION*


Richard J. Foote
Bureau of Agricultural Economics


In 1929, Louis H. Bean published an article describing a graphic
method of multiple correlation which subsequently has been widely
used, particularly in the field of agricultural economics.¹ In the late
1930's, considerable controversy arose among users of the method.
This controversy in part concerned the correct interpretation of the
results obtained in terms of standard mathematical coefficients. To
clarify these aspects, the writer, with J. Russell Ives, published in 1941
a paper outlining in detail the relationship of the graphic method to
the mathematical method of least squares.² The mathematical proofs
in that paper were developed with the assistance of M. A. Girshick,
then on the staff of the Bureau of Agricultural Economics. The ma-
terial presented here is based mainly on that given in the 1941 paper,
but includes certain closely related aspects not published at that time.
Attention is concentrated on the relationship of the graphic method to
the mathematical method of least squares. Adequate descriptions of
the mechanics of the graphic method are available in a number of pub-
lications.³

Relationships between the graphic method and the least squares
method can be explained most effectively in terms of the simpler cases,
especially three-variable linear multiple regression. In such cases, if X_1
is the dependent variable and X_2 and X_3 are independent variables,
when r_23 is equal to zero, each partial regression coefficient is identical
with the corresponding simple regression coefficient. When r_23 is con-
siderably different from zero, the device of “drift lines” (to be discussed
later) facilitates estimation of first approximations to b_{12.3} and b_{13.2} which
are superior to b_12 and b_13. Successive transference of residuals leads to
lines with slopes approximately equal to the mathematically-calculated
values of b_{12.3} and b_{13.2}, but the process is slow when r_23 is considerably




* This material was prepared for presentation at the 1952 meetings of the American Statistical 
Association. Due to certain unavoidable complications, the session on graphic correlation was not 
held. As the paper by Foote and Ives referred to in footnote 2 was issued only in mimeographed form 
and is now available only in libraries, it appeared worth while to publish this as a journal paper. 

1 Louis H. Bean, “A simplified method of graphic curvilinear correlation,” Journal of the American 
Statistical Association, 24 (1929), 386-97. A mimeographed publication containing essentially the same 
material was issued by the Bureau of Agricultural Economics. 

2 “The relationship of the method of graphic correlation to least squares,” Statistics and Agriculture 
No. 1, U. S. Bureau of Agricultural Economics, 1941. (Processed.)

3 See for example Bean, op. cit., or Thomsen, Frederick Lundy and Foote, Richard Jay, Agricultural
Prices, McGraw-Hill Book Co., New York, 1952, pp. 296-310. 




different from zero. One problem which tends to slow up the speed of
convergence is the inability of the research worker to draw least
squares regression lines accurately. There is a common tendency to
draw them too steep. Another problem is due to the fact that the itera-
tive process converges more rapidly if measured by the mean square
residual than if measured in terms of a particular partial regression co-
efficient. Thus, in difficult cases, the first several rounds of the iterative
process may yield a good approximation to R_{1.23} but poor approxima-
tions to the regression coefficients b_{12.3} and b_{13.2}. These points are dis-
cussed in more detail in the remainder of this paper.


MATHEMATICAL MEANING OF THE DRIFT LINES USED IN 
THE GRAPHIC ANALYSIS 
If, in the equation

$$X_1 = b_1 + b_2X_2 + b_3X_3 + \cdots + b_pX_p, \qquad (1)$$

constant values are assigned to X_3, ..., X_p, then b_3X_3 + ... + b_pX_p
is equal to some constant that can be combined with the constant b_1
to give a new constant K. Equation (1) can then be written as

$$X_1 = K + b_2X_2, \qquad (2)$$

which is the equation of a straight line having a slope equal to b_2. Here
b_2, which may be written as b_{12.34...p}, is the regression of X_1 on X_2 when
X_3, ..., X_p are constant.

If two or more observations in a scatter diagram of X_1 on X_2 had the
same or approximately the same value of X_3, ..., X_p, then an esti-
mate of b_{12.3...p} could be obtained by drawing a best fitting line through
them. If this process were repeated for several groups of points having
the same X_3, ..., X_p values, several lines whose slopes are estimates
of the same partial regression coefficient, b_{12.3...p}, would be obtained.
The process of obtaining estimates of b_{12.3...p} from the slope of these
lines is equivalent to breaking the total sample into selected sub-
samples and obtaining from each of these an independent estimate of
b_{12.3...p}. Such lines are the “drift lines” used in the graphic method.

The closeness with which the average of these slopes approximates
the mathematical partial regression line will depend upon the stability
of the slopes of the individual drift lines. In general, the amount of
fluctuation that may be expected in the slopes of the drift lines will
depend on (1) the number of observations and the extent of variation
in the X_2 values on which each is based and (2) the size of the partial
correlation between X_1 and X_2 when X_3 is constant.
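
The idea of a drift line can be illustrated numerically. The sketch below is not the graphic procedure itself. It groups hypothetical observations by their X_3 value, fits a least squares line of X_1 on X_2 within each group, and compares those slopes with the simple regression that ignores X_3. The data are constructed from X_1 = 1 + 2X_2 + 3X_3, an assumption made only for illustration, and the helper names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def slope(points):
    """Least squares slope of y on x for a list of (x, y) pairs.
    The x values must not all be equal."""
    xs, ys = zip(*points)
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def drift_line_slopes(observations):
    """Group (X1, X2, X3) observations by X3 and fit X1 on X2 in each group."""
    groups = defaultdict(list)
    for x1, x2, x3 in observations:
        groups[x3].append((x2, x1))
    return {x3: slope(pts) for x3, pts in groups.items() if len(pts) >= 2}


# Hypothetical data from X1 = 1 + 2*X2 + 3*X3, with X2 and X3 correlated.
data = [(1 + 2 * x2 + 3 * x3, x2, x3)
        for x3 in (1, 2, 3) for x2 in (x3, x3 + 1, x3 + 2)]

drift = drift_line_slopes(data)                    # one slope per X3 value
simple = slope([(x2, x1) for x1, x2, _ in data])   # ignores X3 entirely
print(drift, simple)
```

In this constructed case every drift line has the slope 2 of the partial regression, while the simple regression of X_1 on X_2 is steeper because X_2 and X_3 move together.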













PROOF THAT THE SIMPLE REGRESSION IN THE FINAL CHART WILL
EQUAL THE PARTIAL REGRESSION PROVIDED THE PARTIAL
REGRESSION IN THE FIRST CHART HAS BEEN ESTIMATED
CORRECTLY (THREE-VARIABLE PROBLEM)


As it is not obvious that the simple regression in the second or final
chart of a three-variable problem will equal the partial regression even
if the drift lines correctly estimate the partial regression in the first
chart, a mathematical proof is given. Stated mathematically, the prob-
lem is as follows: if the deviations from the regression in the first chart
are considered as a new variable V_1, so that

$$V_1 = X_1 - b_{12.3}X_2,$$

and if the first approximation to b_{12.3} is equal to its mathematically cal-
culated value for the sample of data, we wish to show that the simple
regression between V_1 and X_3 is equal to b_{13.2}.

The following symbols, which may be new to some readers, are used.

$$a_{12} = \frac{S(X_1-\bar X_1)(X_2-\bar X_2)}{N-1}, \quad a_{13} = \frac{S(X_1-\bar X_1)(X_3-\bar X_3)}{N-1}, \quad \text{etc.}$$

$$a_{11} = \frac{S(X_1-\bar X_1)^2}{N-1}, \quad a_{22} = \frac{S(X_2-\bar X_2)^2}{N-1}, \quad \text{etc.}$$

where X̄_1 is the mean of X_1, etc., N is the number of observations in the
sample, and S is the sum for all observations in the sample.
The following transformations can be made.

$$a_{12} = s_1s_2r_{12}, \quad a_{13} = s_1s_3r_{13}, \quad \text{etc.}$$

$$a_{11} = s_1^2, \quad a_{22} = s_2^2, \quad \text{etc.}$$

where s_1 is the standard deviation of X_1, etc.
It can be shown that in terms of the standard deviations and corre-
lations

$$b_{12.3} = \frac{s_1}{s_2}\,\frac{r_{12}-r_{13}r_{23}}{1-r_{23}^2} \qquad (3)$$

and

$$b_{13.2} = \frac{s_1}{s_3}\,\frac{r_{13}-r_{12}r_{23}}{1-r_{23}^2}. \qquad (4)$$

It is desired to determine the simple regression coefficient of V_1 on X_3
when

$$V_1 = (X_1-\bar X_1) - b_{12.3}(X_2-\bar X_2).^{4}$$

Now

$$b_{V_1X_3} = \frac{SV_1(X_3-\bar X_3)}{S(X_3-\bar X_3)^2}
= \frac{S[(X_1-\bar X_1) - b_{12.3}(X_2-\bar X_2)](X_3-\bar X_3)}{S(X_3-\bar X_3)^2}
= \frac{a_{13} - b_{12.3}\,a_{23}}{a_{33}}.$$

Substituting the value of b_{12.3} from equation (3) and simplifying,

$$b_{V_1X_3} = \frac{s_1}{s_3}\,\frac{r_{13}-r_{12}r_{23}}{1-r_{23}^2}.$$

By equation (4),

$$b_{V_1X_3} = b_{13.2}.$$


Except that more algebraic manipulations are required, it is equally
easy to show that the simple regression on the final chart of a problem
involving more variables will equal the partial regression $b_{1n.23\cdots(n-1)}$,
provided that all of the other partial regressions have been estimated
correctly by the use of drift lines.
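The three-variable result can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes NumPy, uses synthetic data, and the function names `partial_slopes` and `simple_slope` are hypothetical. It computes the least squares value of $b_{12.3}$, forms $V_1 = X_1 - b_{12.3}X_2$, and compares the simple regression of $V_1$ on $X_3$ with $b_{13.2}$.

```python
import numpy as np

def partial_slopes(x1, x2, x3):
    """Least squares slopes b12.3 and b13.2 of x1 on x2 and x3 (intercept included)."""
    design = np.column_stack([np.ones_like(x2), x2, x3])
    coef, *_ = np.linalg.lstsq(design, x1, rcond=None)
    return coef[1], coef[2]

def simple_slope(y, x):
    """Simple regression slope of y on x: S(y - ybar)(x - xbar) / S(x - xbar)^2."""
    xd = x - x.mean()
    return float(np.dot(y - y.mean(), xd) / np.dot(xd, xd))

# Synthetic sample (an assumption for illustration only).
rng = np.random.default_rng(0)
x2 = rng.normal(size=50)
x3 = 0.6 * x2 + rng.normal(size=50)          # correlated independent variables
x1 = 2.0 + 1.5 * x2 - 0.7 * x3 + rng.normal(scale=0.5, size=50)

b12_3, b13_2 = partial_slopes(x1, x2, x3)
v1 = x1 - b12_3 * x2                          # deviations from the first-chart regression
slope_v1_on_x3 = simple_slope(v1, x3)         # slope read from the final chart
```

Because coding by subtraction does not change a regression slope, actual values of $X_1$ and $X_2$ are used here rather than deviations from their means.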


MATHEMATICAL EQUIVALENT OF THE PROCESS OF 
SUCCESSIVE APPROXIMATION 


In the following paragraphs, a mathematical iterative or successive 
approximation method for obtaining the least squares regression co- 
efficients is briefly outlined. The notation applies to a four-variable 
problem. 

In the method of least squares, the coefficients $b_{12.34}$, $b_{13.24}$, and $b_{14.23}$
are determined by minimizing the quantity

$$f(b_{12.34}, b_{13.24}, b_{14.23}) = S[(X_1 - \bar{X}_1) - b_{12.34}(X_2 - \bar{X}_2) - b_{13.24}(X_3 - \bar{X}_3) - b_{14.23}(X_4 - \bar{X}_4)]^2.$$

The solution yields the following three normal equations:

$$b_{12.34}a_{22} + b_{13.24}a_{23} + b_{14.23}a_{24} = a_{12} \qquad (5)$$
$$b_{12.34}a_{23} + b_{13.24}a_{33} + b_{14.23}a_{34} = a_{13} \qquad (6)$$
$$b_{12.34}a_{24} + b_{13.24}a_{34} + b_{14.23}a_{44} = a_{14}. \qquad (7)$$





4 For purposes of derivation, it is convenient to express $X_1$ and $X_2$ in this equation in terms of
deviations from their respective means. Actual values, however, are used in the mechanics of the graphic
method. Since coding by subtraction does not affect the value of a regression coefficient, the proof
applies in either case.













These equations can be solved by well-known methods.

In the iterative process, values $b^{(1)}_{12.34}$ and $b^{(1)}_{13.24}$ are guessed for
$b_{12.34}$ and $b_{13.24}$ respectively in equation (5) and a solution for $b_{14.23}$ (say
$b^{(1)}_{14.23}$) is obtained from this equation. Then $b^{(1)}_{13.24}$ and $b^{(1)}_{14.23}$ are
substituted in equation (6) for $b_{13.24}$ and $b_{14.23}$ respectively and a second
approximation for $b_{12.34}$ (say $b^{(2)}_{12.34}$) is obtained. The values $b^{(2)}_{12.34}$ and
$b^{(1)}_{14.23}$ are substituted in equation (7) for $b_{12.34}$ and $b_{14.23}$ respectively
and a second approximation to $b_{13.24}$ (say $b^{(2)}_{13.24}$) is obtained. The values
$b^{(2)}_{12.34}$ and $b^{(2)}_{13.24}$ are substituted in equation (5) for $b_{12.34}$ and $b_{13.24}$
respectively and a second approximation to $b_{14.23}$ (say $b^{(2)}_{14.23}$) is obtained.
This process is repeated until the coefficients converge to stable
values.

The iterative process outlined above is equivalent to the following
steps: (1) Assign values $b^{(1)}_{12.34}$ and $b^{(1)}_{13.24}$ to $b_{12.34}$ and $b_{13.24}$ respectively
in the function $f(b_{12.34}, b_{13.24}, b_{14.23})$ defined above. Find that value of
$b_{14.23}$ which makes $f(b^{(1)}_{12.34}, b^{(1)}_{13.24}, b_{14.23})$ a minimum. Let it be $b^{(1)}_{14.23}$.
(2) Find that value of $b_{12.34}$ which makes $f(b_{12.34}, b^{(1)}_{13.24}, b^{(1)}_{14.23})$ a minimum.
Let this value be $b^{(2)}_{12.34}$. (3) Find that value of $b_{13.24}$ which makes
$f(b^{(2)}_{12.34}, b_{13.24}, b^{(1)}_{14.23})$ a minimum. Let this value be $b^{(2)}_{13.24}$. (4) Find
that value of $b_{14.23}$ which makes $f(b^{(2)}_{12.34}, b^{(2)}_{13.24}, b_{14.23})$ a minimum. Let
this value be $b^{(2)}_{14.23}$, etc. It will be seen that the steps involved in the
latter process are identical with those for the graphic method involving
three or more variables, as in each case deviations from the approximations
to the regression lines for the other independent variables are
plotted against one of the independent variables and the preceding
approximation to that regression line is adjusted so that it appears to be
the line of best fit.⁵
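Steps (1) to (4) can be sketched in code. The block below is an illustrative sketch only, assuming NumPy and synthetic data; the function name `successive_approximation` and the fixed number of sweeps are hypothetical choices, not part of the graphic method. Each step holds all coefficients but one at their current approximations and refits the remaining one as the simple regression of the residuals on its own variable, which is the one-coefficient minimum of $f$.

```python
import numpy as np

def successive_approximation(y, X, b_start, sweeps=200):
    """Cyclic one-coefficient-at-a-time minimization of the sum of squares f.

    y: dependent variable; X: (N, k) array of independent variables;
    b_start: first approximations to the k regression coefficients.
    """
    yd = y - y.mean()                      # deviations from the means
    Xd = X - X.mean(axis=0)
    b = np.array(b_start, dtype=float)
    k = Xd.shape[1]
    # As in the text, the last coefficient is fitted first, then the others in turn.
    order = [k - 1] + list(range(k - 1))
    for _ in range(sweeps):
        for j in order:
            others = np.delete(np.arange(k), j)
            resid = yd - Xd[:, others] @ b[others]   # deviations from the other regressions
            b[j] = np.dot(resid, Xd[:, j]) / np.dot(Xd[:, j], Xd[:, j])
    return b

# Synthetic four-variable problem (an assumption for illustration only).
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 3))
X[:, 1] += 0.5 * X[:, 0]                   # intercorrelated independent variables
y = 1.0 + X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.3, size=60)

b_iter = successive_approximation(y, X, b_start=[0.0, 0.0, 0.0])
b_exact = np.linalg.lstsq(X - X.mean(axis=0), y - y.mean(), rcond=None)[0]
```

In this sketch the iterated coefficients `b_iter` agree with the direct least squares solution `b_exact` to within rounding once enough sweeps have been made.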


PROOF OF CONVERGENCE TO THE LEAST SQUARES VALUES 


The problem of convergence is considered for three variables only.
The normal equations for three variables are given by

$$b_{12.3}a_{22} + b_{13.2}a_{23} = a_{12}$$
$$b_{12.3}a_{23} + b_{13.2}a_{33} = a_{13}.$$

If the iterative process is performed on these two equations, then the
Kth approximation to $b_{12.3}$ and $b_{13.2}$ respectively can be shown to be
equal to





5 See Thomsen and Foote, op. cit., pp. 299-304. 










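$$b^{(K)}_{12.3} = \frac{s_1}{s_2}\left(r_{12} - r_{13}r_{23} + r_{12}r_{23}^2 - r_{13}r_{23}^3 + - \cdots - r_{13}r_{23}^{2K-3}\right) + b^{(1)}_{12.3}\,r_{23}^{2K-2} \qquad (8)$$

and

$$b^{(K)}_{13.2} = \frac{s_1}{s_3}\left(r_{13} - r_{12}r_{23} + r_{13}r_{23}^2 - r_{12}r_{23}^3 + - \cdots + r_{13}r_{23}^{2K-2}\right) - \frac{s_2}{s_3}\,b^{(1)}_{12.3}\,r_{23}^{2K-1} \qquad (9)$$

$$\phantom{b^{(K)}_{13.2}} = \frac{s_1}{s_3}\left(r_{13} - r_{12}r_{23} + r_{13}r_{23}^2 - r_{12}r_{23}^3 + - \cdots - r_{12}r_{23}^{2K-3}\right) + b^{(1)}_{13.2}\,r_{23}^{2K-2} \qquad (10)$$

where $b^{(K)}_{12.3}$ and $b^{(1)}_{12.3}$ are the Kth and 1st approximations respectively
to $b_{12.3}$ and $b^{(K)}_{13.2}$ and $b^{(1)}_{13.2}$ are the Kth and the 1st approximations
respectively to $b_{13.2}$. But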

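$$b_{12.3} = \frac{s_1}{s_2}\left(r_{12} - r_{13}r_{23} + r_{12}r_{23}^2 - r_{13}r_{23}^3 + - \cdots\right) \qquad (11)$$

and

$$b_{13.2} = \frac{s_1}{s_3}\left(r_{13} - r_{12}r_{23} + r_{13}r_{23}^2 - r_{12}r_{23}^3 + - \cdots\right), \qquad (12)$$

which can be obtained by expanding the denominator of equations (3)
and (4) in an infinite series. The series converges whenever the two
independent variables are not perfectly correlated, so that $r_{23}^2 < 1$.
Hence, comparing equations (8) with (11) and (9) or (10) with (12),
it will be seen that $b^{(K)}_{12.3}$ and $b^{(K)}_{13.2}$ can be made to approximate $b_{12.3}$
and $b_{13.2}$ respectively as closely as desired by taking K sufficiently large.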


SPEED OF CONVERGENCE OF REGRESSIONS 


The speed with which the successive approximations lead to stable
results is of interest for two reasons: (1) It takes time to make successive
approximations and the charts become messy after several sets of
dots have been inserted on them and (2) if the convergence is too slow,
the analyst may think that no further correction is needed in the line
with slope $b^{(K)}_{12.3}$ when in reality its slope is still quite different from the
mathematically calculated value.










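By making algebraic substitutions in equations (8), (9), and (10), it
can be shown that

$$b_{12.3} - b^{(K)}_{12.3} = r_{23}^{2K-2}\left(b_{12.3} - b^{(1)}_{12.3}\right) \qquad (13)$$

and

$$b_{13.2} - b^{(K)}_{13.2} = -\frac{s_2}{s_3}\,r_{23}^{2K-1}\left(b_{12.3} - b^{(1)}_{12.3}\right) \qquad (14)$$

$$\phantom{b_{13.2} - b^{(K)}_{13.2}} = r_{23}^{2K-2}\left(b_{13.2} - b^{(1)}_{13.2}\right). \qquad (15)$$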


Equation (13) states that the difference between the mathematically
calculated $b_{12.3}$ and any given approximation is equal to a function of
the correlation between the independent variables times the error that
was made in the first approximation to $b_{12.3}$. It shows that the higher the
correlation between the independent variables, the slower will be the
speed of convergence.

Using the size of the original error (that is, $b_{12.3} - b^{(1)}_{12.3}$) as a base, it
can be stated from equation (13) that the percentage of error left after
the Kth iteration (that is, in the (K+1)th approximation) is given by
$r_{23}^{2K}$ times 100. Thus, if $r_{23} = 0.2$, the error remaining after one
iteration is 4 per cent of the original error and after two iterations is
0.16 per cent, while if $r_{23} = 0.9$, the error remaining after one iteration
is 81 per cent of the original error. After two iterations it is 65.61 per cent.
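The percentages quoted above follow directly from equation (13). As a small illustrative sketch (the function names are hypothetical, and the iteration count assumes each iteration is an exact mathematical best fit rather than a graphic one), the remaining error and the number of iterations needed to reach a given fraction of the original error can be computed as:

```python
import math

def percent_error_remaining(r23, iterations):
    """Percentage of the original error in b12.3 left after the given number of iterations."""
    return 100.0 * r23 ** (2 * iterations)

def iterations_needed(r23, fraction):
    """Smallest number of iterations leaving at most `fraction` of the original error.

    Requires 0 < |r23| < 1 and 0 < fraction < 1.
    """
    return math.ceil(math.log(fraction) / (2.0 * math.log(abs(r23))))

low = [percent_error_remaining(0.2, k) for k in (1, 2)]    # about 4 and 0.16 per cent
high = [percent_error_remaining(0.9, k) for k in (1, 2)]   # about 81 and 65.61 per cent
slow_case = iterations_needed(0.9, 0.01)                   # iterations to reach 1 per cent
```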

Equation (13) also indicates the importance of the drift lines, since
if the error in $b^{(1)}_{12.3}$ is small, one or two iterations may be enough to
yield a fairly accurate approximation to the mathematically correct
regressions, but if the error is large and the correlation between the
independent variables is also large, 6 or 8 or even more successive
approximations may be required to bring the slope of the regression to
within 0.1 of the correct value.⁶ It is assumed in the graphic method
that the successive approximation process is continued until a visual
inspection indicates that no further improvement is possible.


SPEED OF CONVERGENCE OF MULTIPLE CORRELATION COEFFICIENT 


The size of the multiple correlation coefficient depends upon the
size of the deviations (or unexplained variation) from the final
regressions. If the regressions are inaccurate, the computed multiple
correlation coefficient will be inaccurate. In a three-variable problem, if
$r_{23}$ is near zero, convergence is rapid and errors in $b_{12.3}$ and $b_{13.2}$ are apt
to be small, but any errors made have relatively large effects on $R_{1.23}$.
When $r_{23}$ is near unity, convergence is slow and errors in $b_{12.3}$ and $b_{13.2}$
may be large, but these have relatively little effect on $R_{1.23}$ or on the
errors in predicting $X_1$. The same general reasoning applies to problems
involving more variables.

6 See Foote and Ives, op. cit., pp. 14-18. These equations apply exactly only to a mathematical
iterative process in which an original error is assumed but in which each successive iteration is a
mathematical best fit to the residuals remaining after the previous iteration. The additional error involved in
the graphic approximation to this is believed to be small except in those cases in which the succeeding
corrections become too small to distinguish visually.


INTERPRETING THE CORRELATIONS INDICATED 
IN THE SCATTER DIAGRAMS


In general, the regression lines obtained in the several charts of the 
graphic method have been interpreted correctly as “net,” that is, par- 
tial, regressions between the dependent variable and the separate in- 
dependent variables. Some confusion has occurred in interpreting the 
correlations indicated by the plotted observations in the scatter dia- 
grams. This point can be cleared up if one is careful to note the exact 
meaning of each of the two variables represented by the horizontal and 
vertical scales of the charts, and considers the “visually indicated” cor- 
relation to be the simple correlation of these two variables. 

In the first chart of a three-variable problem, the two variables
represented by the vertical and horizontal scales are simply the dependent
variable and one of the independent variables, $X_2$. Hence, this chart, as
originally plotted, indicates the simple correlation, $r_{12}$.

With respect to the second chart, if $X_1 - b_{12.3}X_2$ is considered as a
variable, $V_1$, and the simple correlation between $V_1$ and $X_3$ is obtained,
the resulting correlation will be equal to the part correlation ${}_{13}r_2$, as
defined by Ezekiel.⁷ In this sense we can say that the second chart
indicates the part correlation ${}_{13}r_2$. Likewise, if $X_1 - b_{13.2}X_3$ is considered
as another variable, $V_2$, and the simple correlation between $V_2$ and $X_2$
is obtained, that correlation will be equal to the part correlation ${}_{12}r_3$.
If $X_2$ were used as the second independent variable instead of $X_3$, the
second chart would then indicate ${}_{12}r_3$. Since the final dots plotted around
the final regression line in the first chart give the same result as would
have been shown in the final chart had the variables been reversed,
this scatter represents ${}_{12}r_3$. Similar results are given for a problem
involving more variables. The final dots plotted around the final
approximation to the regression line in each of the charts represent the
part correlation between the dependent variable and the respective
independent variable.

Part correlations as such do not appear to have much meaning in the
interpretation of an actual problem. However, by making certain
substitutions in Ezekiel's formula for part correlation, it can be shown that

$${}_{13}r_2^{\,2} = \frac{r_{13.2}^2}{1 - (1 - r_{13.2}^2)\,r_{23}^2}.$$

7 Ezekiel, Mordecai, Methods of Correlation Analysis, Ed. 2, John Wiley and Sons, Inc., New York,
1941, p. 213.


Since, in the denominator of this formula, the quantity $1 - r_{13.2}^2$ is
non-negative and no greater than unity, the part correlation between two
variables is always equal to or greater than the partial correlation between
the same variables and the difference increases as $r_{23}$ increases. Because
of this relationship, charts indicating the degree of part correlation
can be used as an indication of the approximate size of the partial
correlation. If the indicated part correlation is low, the partial correlation
must be low, regardless of the correlations between the independent
variables, as in no case will the partial correlation be higher than
the corresponding part correlation. If the part correlation is high, then
either the corresponding partial correlation is relatively high or the
correlations between the independent variables are very high. In a
three-variable problem, for example, if the part correlation was 0.9 or
above, the corresponding partial correlation would be 0.77 or above
unless $r_{23}$ exceeded 0.8.
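The numerical example can be reproduced from the formula relating the part and partial correlations. The sketch below is illustrative only; the function names are hypothetical, and the second function is simply the algebraic inverse of the formula given in the text.

```python
import math

def part_correlation(r13_2, r23):
    """Part correlation (13 r 2) from the partial correlation r13.2 and r23."""
    p = r13_2 ** 2
    return math.copysign(math.sqrt(p / (1.0 - (1.0 - p) * r23 ** 2)), r13_2)

def partial_from_part(part, r23):
    """Partial correlation r13.2 implied by a given part correlation and r23."""
    q = part ** 2
    p = q * (1.0 - r23 ** 2) / (1.0 - q * r23 ** 2)
    return math.copysign(math.sqrt(p), part)

# Part correlation of 0.9 with r23 = 0.8, as in the example above.
implied_partial = partial_from_part(0.9, 0.8)     # roughly 0.78
round_trip = part_correlation(implied_partial, 0.8)
```

With a smaller $r_{23}$ the two coefficients move closer together; for instance, `partial_from_part(0.9, 0.2)` is only slightly below 0.9.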


DEVIATIONS FROM THE REGRESSIONS 


Some investigators have been puzzled by the fact that the deviations 
from the regression lines in certain charts are exactly equal. If the 
mathematically calculated values for the partial regression coefficients 
are obtained, the deviation of the final dot for any given observation 
from the regression line in each chart will be identical. Likewise, if 
calculated values for the dependent variable are plotted against actual 
values and a line is drawn through the origin with a slope of 1, the 
deviation of the calculated value from this line for any given observa- 
tion will be the same as the deviations discussed above. (The simple 
correlation between these two variables equals the multiple correlation 
for the analysis.) This follows from the fact that the deviation for the
ith observation in each of these charts (for a three-variable analysis) is
given by

$$d_i = X_{1i} - b_{12.3}X_{2i} - b_{13.2}X_{3i}.$$


Different degrees of correlation are still indicated by the various 
charts, as the degree of correlation reflects not only the deviations of 
the dots from the respective regression lines but also the relative range 
in the dependent variable involved. This is clear from the definition of 
the coefficient of determination, which is the percentage of variation in 
the dependent variable explained by the independent variable. The 
sum of the squared deviations represents the unexplained variation or 







the total variation in $X_1$ minus the variation explained. But to translate
this into a correlation coefficient, the total amount of variation in
the dependent variable to start with must also be known. In the chart
indicating the degree of multiple correlation, this is the total variation
in $X_1$. But in the charts indicating part correlations, it is the amount
of variation remaining in $X_1$ after adjusting for the effects of the
alternative independent variables.


EFFECTS OF NOT PASSING THE REGRESSIONS THROUGH THE MEANS 


In the original description of the graphic method no mention was
made of drawing the regressions through the means of the variables. In
his original article, Bean stated: “At this point it may be observed that
the arbitrary placing of the approximation curves without reference to
the average values of $X_1$ and of the other variables does not affect the
values of $X_1$ computed from the curves. For example, had the
approximation curve in section 1 been placed higher, the residuals in sections
2 and 3 would have been correspondingly decreased and the curves
lowered.”⁸ The truth of this is fairly obvious.


APPLICATION TO PROBLEMS INVOLVING MORE VARIABLES OR 
CURVILINEAR RELATIONSHIPS 


Most of the mathematical proofs have been given for problems that 
involve three or four variables. The extension to problems involving 
more variables is obvious, although the algebra becomes complicated. 

The graphic method was developed primarily to handle curvilinear 
rather than linear relationships. The proofs given here are in terms of 
linear relationships because the least squares method, as usually con- 
sidered, is applicable to linear relationships or those that can be trans- 
formed to a linear form. Thus, it is easier to show the relationships 
between the graphic and the mathematical methods by confining the 
discussion to the linear case. It has been generally recognized by 
mathematicians and can easily be demonstrated by example that the 
graphic method provides at least a satisfactory method for obtaining 
approximations to the net regression curves when dealing with multiple 
functional relationships, regardless of whether the nature of the func- 
tion is known. The extent to which the graphic method can be used to 
determine the nature of curves for stochastic or probability relation- 
ships will depend mainly on the degree of correlation and the extent to 
which the sample represents the population. As, in most cases, one
never knows for sure whether a given small sample is representative of
the population, any user of regression methods must proceed with caution
and must subject his final results to common sense and to any other
outside checks at his disposal. Some users of the graphic method, as
well as many users of mathematical methods, have assumed that their
methods, as such, are sufficiently reliable so that outside checks are not
necessary.

8 Bean, op. cit., p. 393. A mathematical proof is given in Foote and Ives, op. cit., pp. 32-33.


SUMMARY 


The method of graphic multiple correlation suggested by L. H. Bean 
essentially is based upon three mathematical principles: 

(1) The multiple regression equation becomes the equation of a curve 
when all of the independent variables except one are held constant. In 
the case of linear regression, the curve is a straight line whose slope is 
equal to the partial regression coefficient between the dependent vari- 
able and that independent variable which is permitted to vary. For this 
reason the slopes of the drift lines in the first chart of a three-variable 
analysis indicate the partial regression coefficient. 

(2) If in a three-variable analysis, the true partial regression line is 
obtained in the first chart, the simple correlation between deviations 
from this line and the second independent variable is equal to the part 
correlation (as defined by Ezekiel) and the simple regression is equal to 
the partial regression. Thus, the line obtained in the second chart of a 
three-variable analysis approximates the second partial regression. If 
the degree of correlation between the two independent variables is low, 
the part correlation nearly equals the partial correlation. Hence, the 
scatter in this chart in most cases indicates approximately the degree 
of partial correlation. 

(3) The method of successive approximation outlined by Bean is 
analogous to a mathematical iterative process which converges to the 
least squares solution. Thus, even if an error is made in the first ap- 
proximations to the regressions, succeeding approximations will tend 
to yield more and more accurate results. 

The speed of convergence depends chiefly on the size of the error in 
the first approximation and the size of the correlation between the in- 
dependent variables. The better the first approximation and the 
smaller the intercorrelation, the faster will the process tend to converge. 
The degree of intercorrelation is determined by the nature of the vari- 
ables included in the analysis and hence, once the variables are chosen, 
very little can be done graphically to speed up the convergence. How- 
ever, the accuracy of the first approximations may be greatly enhanced 
by the use of drift lines. 

The same reasoning can be extended to problems that involve more 
than three variables. 





RECENT ADVANCES IN FINDING BEST 
OPERATING CONDITIONS* 


R. L. ANDERSON 
Institute of Statistics, North Carolina State College 


This paper discusses various experimental procedures used to estimate
the optimal point on a response surface and to explore the
nature of the response surface in the vicinity of this optimum.
Multi-factor experiments were first set up to investigate one factor at a time;
then Fisher and Yates introduced the complete factorials for field
experiments, plus confounded arrangements for incomplete block
designs. More recently, fractional replication designs have been
introduced to cut down the size of the experiments.

Hotelling devised methods of locating the optimal point using a sin- 
gle factor. Friedman and Savage outlined a sequential one-factor-at- 
a-time procedure when several factors are involved. 

Box and Wilson present a method of locating the optimum and of 
exploring the response surface in which many factors are varied at the 
same time. They present the use of the path of steepest ascent to get to 
a “near-stationary” region if the experimenter starts at a point far 
removed from it. When the experimenter is near such a region, they 
present the use of a composite design to estimate quadratic and inter- 
action effects. The nature of the response surface is explored by the 
use of a canonical transformation. 

The usefulness of these sequential procedures in various experimental 
situations is discussed. 


1. INTRODUCTION 


Most experimentation has as its ultimate objective the estimation of
some optimal response. However, the lack of a simple experimental
procedure to achieve this objective has resulted in a tremendous number
of piecemeal experiments, each designed to pinpoint some section
of the response surface. This paper will discuss some contributions to
the problem of maximizing a function,

$$y = \phi(x_1, x_2, \ldots, x_k), \qquad (1)$$

where $y$ is the expected response and $x_i$ the amount of the ith factor
used in producing $y$. For example, a quadratic response function might
be written in this form:


* Revision of a paper presented at the 1952 annual meeting of the American Statistical Association. 




$$y = \beta_0 + \sum_{i=1}^{k}\beta_i x_i + \sum_{i=1}^{k}\beta_{ii}x_i^2 + \sum_{i=1}^{k-1}\sum_{j=i+1}^{k}\beta_{ij}x_i x_j, \qquad (2)$$

where $\beta_i$ is called a linear or main effect, $\beta_{ii}$ a quadratic effect and
$\beta_{ij}$ ($i \neq j$) an interaction effect. Of course, the optimal response may be
a minimum (such as with costs), but the procedures are the same for
determining a minimum as for a maximum. In general the word
“optimal” will refer to either case. The problem is one of finding the
level of each factor to achieve optimal $y$, assuming that the factor levels
can be continuously varied. The combination of factor levels which
produces the optimal response will be called the optimal factor combination.
In general, it would be advantageous to know the response function
itself. For example, most production is carried on for a profit. But the
optimal factor combination usually will change with a change in the
factor and product prices. Assume the production function is of the
form

$$q = q(x_1, x_2, \ldots, x_k),$$

where the parameters of $q$ have been estimated. The profit is then

$$\pi = qp - \sum_{i=1}^{k} x_i p_i,$$

where $p$ is the price of the product and $p_i$ the price of the ith factor.
Then the static optimal factor combination is given by the solution of
the $k$ equations

$$\frac{\partial q}{\partial x_i} = \frac{p_i}{p}, \qquad i = 1, 2, \ldots, k. \qquad (3)$$

Of course, the dynamic solution is more complicated, because changes
in $q$ and the $x_i$ can be expected to change $p$ and the $p_i$, especially if this
product is an important part of the economy. In fact, one might need
to know the demand function for the product and the supply functions
for the factors. But the important point to note here is that, once these
functions have been determined, the determination of the optimum
requires no more experimentation.
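For a quadratic production function of the form (2), equations (3) are linear in the factor levels and can be solved directly. The following is a minimal sketch under stated assumptions: it uses NumPy, the coefficient values and prices are invented for illustration, the function names are hypothetical, and the matrix `B` is assumed symmetric with the quadratic effects on its diagonal and half of each interaction effect off the diagonal, so that the gradient of $q$ is `beta_lin + 2 B x`.

```python
import numpy as np

def production(x, beta0, beta_lin, B):
    """Quadratic production function q(x) = beta0 + beta_lin.x + x.B.x."""
    x = np.asarray(x, dtype=float)
    return beta0 + beta_lin @ x + x @ B @ x

def static_optimum(beta_lin, B, product_price, factor_prices):
    """Solve dq/dx_i = p_i / p, i.e. beta_lin + 2 B x = factor_prices / product_price."""
    rhs = np.asarray(factor_prices, dtype=float) / product_price - beta_lin
    return np.linalg.solve(2.0 * B, rhs)

def profit(x, beta0, beta_lin, B, product_price, factor_prices):
    """Profit = q * p minus the cost of the factors used."""
    return product_price * production(x, beta0, beta_lin, B) - np.dot(x, factor_prices)

# Hypothetical estimated coefficients and prices (illustration only).
beta0 = 5.0
beta_lin = np.array([10.0, 8.0])
B = np.array([[-1.0, 0.25],
              [0.25, -2.0]])               # negative definite, so the optimum is a maximum
p = 2.0
p_factors = np.array([4.0, 2.0])

x_opt = static_optimum(beta_lin, B, p, p_factors)
best_profit = profit(x_opt, beta0, beta_lin, B, p, p_factors)
```

Changing `p` or `p_factors` and calling `static_optimum` again gives the new optimal factor combination without any further experimentation, which is the point made in the text.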
Experimental procedures for estimating the parameters of a multi- 
factor response function are now being developed. Box [3] discusses 


multi-factor designs for estimating a planar response surface. He shows 
that 







(i) when prior knowledge of the response surface exists, the design may be 
rotated to reduce possible biases (e.g. quadratic and interaction), and 
(ii) rotation can be used to eliminate such systematic effects as time trends. 


Box and Wilson [2] describe methods of exploring the response surface 
in a “near-stationary” region. 

Before starting any experimentation to explore the response surface, 
the experimenter must select the factors and the factor levels to be 
used in the experiment. The factors are usually decided on the basis of 

(i) previous experimentation and theoretical study in the field, 
(ii) practical consideration of factors which can be varied in the production
process and in the experiment, and
(iii) time and facilities available.


The selection of the factor levels is usually a matter of judgment on 
the part of the experimenter. He considers the possible range of the 
factor levels and previous experience on the differences in levels needed 
to produce detectable response changes, if such exist. These problems 
are common to all the experimental procedures to be discussed in this 
paper. They are largely non-statistical problems, but the statistician 
should be sure that the experimenter understands the importance of 
selecting the correct factors and suitable factor levels. 


2. FACTORIAL EXPERIMENTS 


In the first multi-factor experiments, a single factor was varied at
a time. For example with 5 factors, one might plan $5l$ experiments,
in which each of the factors in turn was used at $l$ levels while the other
4 factors were held at some starting level. Fisher [9] and Yates [22]
encouraged the use of complete factorials and developed a large number
of special designs involving them. In a complete factorial, all
combinations of the factor levels are used, e.g., $l^5$ for the above experiment.
These designs were developed for experiments in which the
experimental error could not be neglected. In order to estimate the
magnitude of this error in each experiment, the experiment had to be
repeated several times, say $r$. These factorial designs were formed largely
for field experiments in which sequential experimentation would be
less useful than with laboratory experiments, and the factors were
often of the discrete type, e.g. varieties or rations.

Because of the large number of factor combinations required in many 
field experiments, it was felt that some form of incomplete block 
design was needed to reduce the experimental error. This resulted in 
the so-called confounded designs, e.g. with 2*, 3*, 3X2", 3X2, 4* designs. 
These are described by Yates [23]. More complicated factorial designs 







have been constructed by Nair [17, 18], Bose [1], Finney [8], and Li
[16], among others.

When physical scientists and engineers became interested in multi- 
factor experiments, they found that complete and confounded factorials 
required too many experimental units, especially since the experimental 
errors were often much lower than in field experiments. One method of 
reducing the number of experimental units was to use higher order 
interaction effects to estimate the error and hence avoid repetitions 
of the design. Then Finney [6, 7], Plackett and Burman [19], Kempthorne
[14], Rao [20], and Davies and Hay [5] developed the fractional
replication designs, based on using parts of the confounded designs.
Yates [22] and Hotelling [13] had already mentioned the use of such
designs. A new approach for continuous factor levels has been suggested
by Box [3].


3. LOCATING THE OPTIMAL POINT USING A SINGLE FACTOR 


Hotelling [12] considers in detail the problem of obtaining a maxi- 
mum response when only one factor is involved, advocating the fol- 
lowing experimental procedure: 

(i) An early speculative study of the problem to indicate the range within which
the optimum lies.
This study should also include some good theory to help delimit the
problem.

(ii) An intermediate stage to supply estimates of the parameters of the response
function.
One might use six equally spaced values within the range in (i) to fit a
fifth degree polynomial and estimate the optimum point. If several samples
are obtained at each point, one can also estimate $\sigma$.

(iii) A final experiment.
Let $x$ measure the deviation from the estimated optimum and assume the
true response equation is

$$f(x) = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \beta_4 x^4 + \cdots. \qquad (4)$$

Assume $f(x)$ can be approximated by a quadratic equation, so that 3 or
more values of $x$ are needed.


The estimates of the parameters in the quadratic equation will
be biased if $\beta_3, \beta_4, \ldots$ in equation (4) are not zero. Hotelling shows
how to allocate N sample values so as to make the cubic bias zero and
the quartic bias a minimum, assuming the variance is fixed.

Hotelling briefly studied the case of two factors, for which 6 points 
are needed to estimate linear, quadratic and first order interaction 
coefficients. In order to make the cubic bias vanish, he established that 


(i) no 3 points lie on a straight line,
(ii) no 4 points lie on 2 straight lines through the origin,
(iii) no 4 points can be vertices of a parallelogram,
(iv) the 6 points cannot consist of the origin and the vertices of a regular
pentagon with center at the origin.


4. A SEQUENTIAL ONE FACTOR-AT-A-TIME DESIGN 


Friedman and Savage [11] described a sequential multi-factor plan 
to locate a local maximum and to describe the response surface near 
this maximum. They wanted to explore the response surface near the 
maximum in order to 


(i) indicate the seriousness of choosing a factor combination somewhat 
different from the maximum in order to protect other qualities than the 
one studied, 

(ii) determine the relative importance of various factors, 

(iii) serve as a stimulus to develop the theoretical nature of the response, 
(iv) indicate the seriousness of a lack of control of factor levels in the produc- 
tion process. 


They reject the complete factorial design because 


(a) the levels chosen may be far from the maximum, 
(b) if one chooses levels too far apart, he may obtain a very superficial de- 
scription of the response surface near the maximum. 


In addition, they point out that the factorial design is essentially a 
discrete level design and does not take account of the essential con- 
tinuity or ordered character of many factor levels. This is tied in with 
the Hotelling results, which show how one can improve the estimate of 
the maximum by choosing the levels at unequal intervals and by using 
a different number of samples for each level. However, the use of or- 
thogonal linear forms simplifies tests of linear, quadratic, and higher 
components when the levels are equally spaced. 

The Friedman-Savage procedure is as follows: 


(i) Use the best estimate of the optimal factor combination as the initial 
one. 

(ii) Order the factors in some manner. The authors do not say how to do this, 
but one might order them according to his estimate of the possible effects 
of changes of each on the final response. For example, if one factor were 
very important, the experiments might not detect differences for other 
factors unless this first factor were near its optimal level. 

(iii) Vary the levels of only the first factor until an approximate optimum
is located for it. Presumably the Hotelling idea of fitting a polynomial
would be useful if the levels were continuous.

(iv) Using the optimal level of the first factor and the starting point of all but 
the second, find the optimal level for the second; proceed in this manner 
until all factors have been investigated. 

(v) If necessary, repeat another round, but start with the set of local optima 
in (iv). 







(vi) If the changes in the second round indicate a need for further experimenta- 
tion, it might be advisable to proceed on a path defined by the two sets 
of local optima. This is similar to a device used by Box and Wilson, which 
will be explained later. 
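
The one-factor-at-a-time rounds in steps (i) to (v) can be sketched in code. This is only an illustration under stated assumptions: the function name, the fixed grid of candidate levels for each factor, and the toy response surface are hypothetical and are not taken from Friedman and Savage.

```python
def one_factor_at_a_time(response, start, levels, rounds=2):
    """Cyclic one-factor-at-a-time search for a maximum (illustrative only).

    response: function of a list of factor levels (hypothetical stand-in
              for running an experiment).
    start:    initial best guess of the optimal factor combination.
    levels:   candidate levels to try for each factor, in the chosen order.
    """
    point = list(start)
    for _ in range(rounds):
        for i, grid in enumerate(levels):
            # Vary factor i alone, holding every other factor at its current level.
            point[i] = max(
                grid,
                key=lambda level: response(point[:i] + [level] + point[i + 1:]),
            )
    return point


def toy_response(x):
    # Hypothetical error-free surface with its maximum at (1, 2).
    return -(x[0] - 1.0) ** 2 - (x[1] - 2.0) ** 2


levels = [[0.0, 0.5, 1.0, 1.5, 2.0], [0.0, 1.0, 2.0, 3.0]]
best = one_factor_at_a_time(toy_response, start=[0.0, 0.0], levels=levels)
```

With experimental error the comparison inside `max` would have to be replaced by a fitted curve or replicated runs; the sketch ignores error entirely.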


Friedman and Savage suggest that the differences between factor 
levels should be reduced as one gets closer to the optimum. This en- 
ables one to map the surface near the optimum. However, if the experi- 
mental error is very large, this error may mask the small response 
differences near the optimum. 

Friedman and Savage made a number of comparisons, showing the
smaller number of experiments with the sequential plan as compared to
complete factorials. However, they did not make comparisons with
fractional factorial designs. If there are many factors, it is possible to
use a small fraction of the complete factorial without confounding
main effects and 2-factor interactions with each other. For example,
1/8 of a $2^{10}$ design will enable one to estimate all main effects and
2-factor interactions if all 3- and higher-factor interactions are negligible,
and similarly for 1/9 of a $3^7$ design (see Kempthorne [15], Sec. 21.7).
If previous information indicates that certain of the 2-factor interactions
can be neglected, the designs could be even further fractionalized.


5. A SEQUENTIAL DESIGN VARYING MANY FACTORS AT A TIME


A recent article by Box and Wilson [2] is devoted to the problem of 
determining optimal factor combinations in chemical investigations. 
Their methods also enable the response surface to be described in the 
neighborhood of the optimum. The discussion of these methods in the 
1951 article was condensed for publication. After discussion with Dr. 
Box, this writer believes the following is a correct description of the 
Box-Wilson techniques. 

(i) Conduct some initial experiments in the vicinity of the previously
known best factor combination. These initial factor combinations probably
would be based on a complete or fractional factorial design, usually
of the $2^k$ type. The $2^k$ designs are simple to analyze and interpret and
give good estimates of main effects and 2-factor interactions.

(ii) If the main effects in step (i) are large compared to the 2-factor
interactions (i.e., the response surface is roughly planar in the region
of these initial factor levels), the experimenter would be led to try new
factor levels which are changed in the direction of largest response in
the initial experiments. Box and Wilson explore with new experiments
the path of steepest ascent (or descent if a minimum is desired), in which
each factor is varied proportionally to its unit effect in the initial
experiments. The procedure of steps (i) and (ii) is repeated until the
first order effects are small, so that no further progress is possible by
this method.¹ The experimenter is then brought to a near-stationary
region.
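
As a rough illustration of step (ii), the sketch below estimates first-order slopes from a full two-level factorial in coded units and lists points along the path of steepest ascent. The function names, the coded levels of -1 and +1, and the error-free planar response are assumptions made for the example; they are not from Box and Wilson.

```python
from itertools import product


def main_effect_slopes(runs):
    """Least-squares slopes of a first-order model from a full 2^k factorial.

    runs maps each tuple of coded levels (-1 or +1) to an observed response.
    For this orthogonal design the slope for factor i is the mean of x_i * y.
    """
    k = len(next(iter(runs)))
    n = len(runs)
    return [sum(x[i] * y for x, y in runs.items()) / n for i in range(k)]


def steepest_ascent_path(center, slopes, n_steps, step=1.0):
    """Points at which each factor is moved in proportion to its slope."""
    scale = max(abs(b) for b in slopes)
    direction = [b / scale for b in slopes]
    return [
        [c + step * j * d for c, d in zip(center, direction)]
        for j in range(1, n_steps + 1)
    ]


# Hypothetical planar response y = 50 + 4*x1 + 2*x2, observed without error.
runs = {x: 50 + 4 * x[0] + 2 * x[1] for x in product((-1, 1), repeat=2)}
slopes = main_effect_slopes(runs)
path = steepest_ascent_path([0.0, 0.0], slopes, n_steps=3)
```

In practice new experiments would be run at the points in `path` until the response stops improving, after which a fresh factorial is placed around the best point found.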

A technique is provided for avoiding gross errors in selecting the 
ranges of the factor levels in the initial set of experiments. 

(iii) When the experimenter has reached a near-stationary region,
he conducts some additional experiments specifically designed to estimate
the quadratic and interaction effects in equation (2). The $3^k$
designs have been developed to do this; however, the size of a $3^k$ experiment
becomes unwieldy for large $k$. One notes that the 2-factor interaction
effects for a $3^k$ experiment can be divided into four groups:
linear × linear; quadratic × linear; linear × quadratic; and quadratic
× quadratic. Presumably the latter effects (which are of the fourth
degree) would be negligible, and perhaps the middle two groups of
effects (which are of the third degree). Hence one might like to use a
design which would enable him to estimate only the linear, quadratic
and linear × linear interaction effects [the parameters in equation
(2)].

Box and Wilson's composite design was developed to accomplish this
purpose. One form of this design is to add $(2k+1)$ experiments to the
last set of $2^k$ experiments in step (ii). If we designate the factor levels
at the center of this $2^k$ design as $(0, 0, \ldots, 0)$, the new factor
combinations would be

$$(0, 0, \ldots, 0);\ (\pm\alpha, 0, \ldots, 0);\ (0, \pm\alpha, \ldots, 0);\ \ldots;\ (0, 0, \ldots, \pm\alpha).$$

$\alpha$ can be determined either so that the design is orthogonal (the estimated
effects are all non-correlated) or so that the second order effects
are estimated with equal precision. If the factorial experiments had
indicated that the optimum was near one of the corners of the factorial
design, the center for the composite design could be located at this
corner.
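
The point set just described is easy to enumerate. The following is a minimal sketch in coded units; the function name and the value of alpha used in the example are illustrative assumptions, and no attempt is made to choose alpha for orthogonality or equal precision.

```python
from itertools import product


def composite_design(k, alpha):
    """Coded points of a composite design for k factors.

    Returns the 2^k factorial points followed by the (2k + 1) added points:
    the center and the pairs (+alpha, -alpha) on each factor axis.
    """
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=k)]
    center = [[0.0] * k]
    axial = []
    for i in range(k):
        for sign in (1.0, -1.0):
            point = [0.0] * k
            point[i] = sign * alpha
            axial.append(point)
    return factorial + center + axial


design = composite_design(k=3, alpha=1.5)  # alpha = 1.5 is an arbitrary choice
```

For three factors this gives 8 + 7 = 15 points, against 27 for the complete $3^3$ design.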

(iv) Once the experimenter has obtained rather stable estimates
of the parameters in equation (2), the optimal factor levels,

$$x^0 = (x_1^0, x_2^0, \ldots, x_k^0), \qquad (5)$$

can be estimated. The predicted response for this factor combination
is

$$y^0 = b_0 + \tfrac{1}{2} \sum_i b_i x_i^0, \qquad (6)$$


¹ When the response becomes almost stationary in the first path, a new set of $2^k$ experiments is
conducted to determine if the first order effects are small in this new region or if a new path should be
followed. It should be pointed out that if the main effects had been small compared to the interaction
effects in the initial experiments of step (i), step (ii) would be omitted.










where $b_i$ is the estimate of $\beta_i$ in (2). After shifting the origin of the
system to $x^0$, the quadratic form (2) can be reduced to the canonical
form

$$\hat{y} = y^0 + \sum_i \lambda_i X_i^2, \qquad (7)$$

where $\hat{y}$ is the estimate of $y$ and the $X_i$ are linear functions of the $x_i$. These
$X_i$ are the axes of a coordinate system with center at $x^0$.
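
For two factors the reduction to canonical form can be carried out in closed form. The sketch below assumes that equation (2) is the usual second-order model $y = b_0 + b_1x_1 + b_2x_2 + b_{11}x_1^2 + b_{22}x_2^2 + b_{12}x_1x_2$; the function name and the numerical coefficients are hypothetical.

```python
import math


def canonical_analysis(b0, b1, b2, b11, b22, b12):
    """Stationary point, predicted response there, and canonical coefficients.

    Assumes the fitted surface
        y = b0 + b1*x1 + b2*x2 + b11*x1**2 + b22*x2**2 + b12*x1*x2.
    """
    # Setting both partial derivatives to zero gives
    #   2*b11*x1 + b12*x2 = -b1  and  b12*x1 + 2*b22*x2 = -b2.
    det = 4.0 * b11 * b22 - b12 ** 2
    if det == 0.0:
        raise ValueError("no unique stationary point")
    x1 = (-2.0 * b22 * b1 + b12 * b2) / det
    x2 = (-2.0 * b11 * b2 + b12 * b1) / det
    # Predicted response at the stationary point, as in equation (6).
    y0 = b0 + 0.5 * (b1 * x1 + b2 * x2)
    # The lambdas are the eigenvalues of [[b11, b12/2], [b12/2, b22]].
    mean = (b11 + b22) / 2.0
    half = math.sqrt(((b11 - b22) / 2.0) ** 2 + (b12 / 2.0) ** 2)
    return (x1, x2), y0, (mean + half, mean - half)


# Hypothetical fitted coefficients, chosen only to give round numbers.
point, y0, lambdas = canonical_analysis(80.0, 2.0, 4.0, -1.0, -2.0, 0.0)
```

Here both lambdas are negative, so the stationary point is a maximum; lambdas of opposite sign would indicate a saddlepoint, and one lambda near zero a ridge.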
(v) The following tentative conclusions could be made: 


(a) If $x^0$ is not far removed from the experimental center and the $\lambda_i$ in (7) are
of the same sign, the experimenter can conclude that $y^0$ is near the true
optimum response. He probably would conduct several confirmatory
experiments with factor combinations near $x^0$ and then reevaluate $x^0$
until fairly stable results were obtained.

(b) If one of the $\lambda$'s is small relative to the others, the response surface has a
ridge along the corresponding $X$-axis. Dr. Box states that, "often the
most important practical problem is to determine the nature of the local
ridge system." If a ridge is present, the experimenter can then use as the
optimal factor combination the one along this ridge which is cheapest or
easiest to use or the one which produces the optimal response for some
other characteristic.

(c) If some of the larger $\lambda$'s in (7) are of opposite signs, the experimenter is
at a saddlepoint; Box and Wilson outline additional experiments to use
in this case.

(d) If $x^0$ is far removed from the experimental center, equation (7) could not
describe the surface at $x^0$. In this case one would suspect the existence of a
rising ridge along the axis of $X_1$, say, with a small value of $\lambda_1$ in (7). It
would now be advisable to shift to an origin on $X_1$ but near the experimental
center and obtain as a substitute for equation (7):

$$\hat{y} = y' + B_1 X_1' + \lambda_1 X_1'^2 + \sum_{i=2}^{k} \lambda_i X_i^2, \qquad (7')$$

where $y'$ is the predicted response at this new origin. The experimenter
would then explore along the $X_1'$-axis.


An independent successful use of the Box-Wilson techniques is 
given by Read [21]. 


6. CONCLUSIONS 


This paper has presented some recent ideas on the use of sequential 
methods to estimate optimal factor combinations and to explore the 
nature of the response surfaces in the vicinity of these optima. The 
reader should be cautioned that the success of these sequential pro- 
cedures depends on the following conditions: 

(i) The experiments can be sequentialized. 
(ii) The factor levels can be varied continuously. 
(iii) The experimental error is small and generally well estimated in advance. 


If some of the factor levels are discrete, it may be necessary to locate
an optimal combination of the other factors for each discrete level
and use the best of the local optima. However, it may be possible to 
find characteristics of the discrete factors which are continuous, for 
example, genetic features of varieties, chemical compositions of differ- 
ent soils, or average educational or economic features of different 
human groups. Hence, one of the objectives in future research may be 
the quantification of qualitative factors. 

The use of sequential procedures in biological and social experimen- 
tation may be limited because of the length of time required to conduct 
the experiments and the presence of large experimental errors. In many 
cases, however, response changes over time can be measured by the 
introduction of additional experimental factors. If such time changes 
can not be estimated directly, it may be necessary to use some control 
factor combinations with every new set of factor combinations. If 
controls are needed in the sequential procedure, it may be more efficient 
to use larger initial factorial experiments. Kempthorne [15] discusses 
the use of fractional factorials in incomplete blocks design. If repli- 
cations are needed to estimate experimental errors, replicate experi- 
ments also can be performed in sequential experimentation. It would 
appear that, even in the biological and social fields, the sequential 
methods discussed here should be useful in planning many long-term 
experiments. Here is a place for coordinated research at several research
centers, to avoid duplications and serious omissions in the
factor combinations used.²

Two methods of conducting multi-factor sequential experiments 
have been discussed: the use of one-factor-at-a-time and the Box- 
Wilson procedure of varying several factors at once. Another method 
might be mentioned—a procedure based on the random selection of 
factor combinations. It would be useful to have these three methods 
compared in various experimental situations. This would seem to be
a useful statistical research project. As more response surfaces are 
explored, it will be useful to know how many of them have ridge sys- 
tems. Presumably the one-factor-at-a-time approach will not be very 
efficient in the exploration of a ridge. In particular, this approach 
would not tell the experimenter that the optimal factor combination 
can be located anywhere along this ridge. In most production, several 
responses must be optimized at the same time; hence, a good descrip- 
tion is needed for each response surface, for example, costs of produc- 
tion, yield, and quality of product. 





² Fisher [10] discusses the use of sequential experimentation in a genetics experiment and Bross
[4] in medical experiments; however, neither of these articles is concerned with the estimation of an
optimal factor combination.







REFERENCES 


[1] Bose, R. C., “Mathematical theory of the symmetrical factorial design,” 
Sankhya, 8 (1947), 107-66. 

[2] Box, G. E. P., and Wilson, K. B., “On the experimental attainment of op- 
timum conditions,” Journal of the Royal Statistical Society, Series B, 13 
(1951), 1-45. 

[3] Box, G. E. P., “Multi-factor designs of first order,” Biometrika, 39 (1952), 
49-57. 

[4] Bross, I., “Sequential medical plans,” Biometrics, 8 (1952), 188-205. 

[5] Davies, O. L., and Hay, W. A., “The construction and use of fractional 
factorial designs in industrial research,” Biometrics, 6 (1950), 233-49. 

[6] Finney, D. J., “The fractional replication of factorial arrangements,” 
Annals of Eugenics, 12 (1945), 291-301. 

[7] Finney, D. J., “Recent developments in the design of field experiments, 
III. Fractional replication,” Journal of Agricultural Science, 36 (1946), 
184-91. 

[8] Finney, D. J., “The construction of confounded arrangements,” Empire 
Journal of Experimental Agriculture, 15 (1947), 107-12. 

[9] Fisher, R. A., The Design of Experiments, 1st edition. Oliver and Boyd, 
Edinburgh and London, 1935. 

[10] Fisher, R. A., “Sequential experimentation,” Biometrics, 8 (1952), 183-87. 

[11] Friedman, Milton, and Savage, L. J., “Planning experiments seeking max- 
ima.” Chapter 13 of Techniques of Statistical Analysis, edited by Eisenhart, 
Hastay and Wallis. McGraw-Hill Book Co., New York, 1947. 

[12] Hotelling, Harold, “Experimental determination of the maximum of a
function,” Annals of Mathematical Statistics, 12 (1941), 20-45.

[13] Hotelling, Harold, “Some improvements in weighing and other experi- 
mental techniques,” Annals of Mathematical Statistics, 15 (1944), 297-306. 

[14] Kempthorne, O., “A simple approach to confounding and fractional replica- 
tion in factorial experiments,” Biometrika, 34 (1947), 255-72. 

[15] Kempthorne, Oscar, The Design and Analysis of Experiments, John Wiley 
and Sons, Inc. New York, 1952. 

[16] Li, Jerome C. R., “Design and statistical analysis of some confounded 
factorial experiments,” Iowa State College Agricultural Experiment Station 
Bulletin 333, (1944). 

[17] Nair, K. Raghavan, “On a method of getting confounded arrangements in 
the general symmetrical type of experiments,” Sankhya, 4 (1938), 121-38. 

[18] Nair, K. R., “Balanced confounded arrangements for the $5^n$ type of experiments,”
Sankhya, 5 (1940), 57-70.

[19] Plackett, R. L., and Burman, J. P., “The design of optimum multi-factor
experiments,” Biometrika, 33 (1946), 305-25.

[20] Rao, C. R., “Factorial experiments derivable from combinatorial arrange- 
ments of arrays,” Journal of the Royal Statistical Society Supplement, 9 
(1947), 128-39. 

[21] Read, D. R., “The design of chemical experiments,” accepted for publica- 
tion in Biometrics, (1953). 

[22] Yates, F., “Complex experiments,” Journal of the Royal Statistical Society 
Supplement, 2 (1935), 181-247. 

[23] Yates, F., The Design and Analysis of Factorial Experiments, Imperial Bu- 
reau of Soil Science Technical Communication No. 35, (1937). 





A NOTE ON REGRESSION WHEN THERE IS 
EXTRANEOUS INFORMATION ABOUT ONE 
OF THE COEFFICIENTS 


J. DURBIN
London School of Economics 


1. INTRODUCTION


SUPPOSE we have a sample of $n$ observations corresponding to the
regression model

$$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \epsilon,$$

where the $n$ values of $\epsilon$ are independent of each other and of the $x$'s and
have zero means and variance $\sigma^2$. In addition to this sample we are
given from outside an unbiased estimate $b_1$ of $\beta_1$ together with an
unbiased estimate $s_1^2$ of $\sigma_1^2$, its variance. What is the best way of using this
information to estimate $\beta_2$?

Situations of this kind arise in econometric work in combining cross- 
section and time-series data. For instance, in a demand study we may 
wish to estimate the price elasticity of demand from a time series of 
observations using at the same time an estimate of the income elasticity 
obtained from a budget survey. 

This problem was put to me when I was a research worker at the 
Department of Applied Economics, Cambridge, by my colleague, 
M. J. Farrell. Later developments were worked out in co-operation 
with Richard Stone, who kindly supplied the data for the numerical 
example. 


2. SIMPLE METHOD 


The simplest procedure is to accept $b_1$ as the estimate of $\beta_1$ and to
estimate $\beta_2$ by considering the regression of $Y - b_1X_1$ on $X_2$. Denoting
by $y$, $x_1$, $x_2$ the deviations of $Y$, $X_1$, $X_2$ from their sample means, the
estimate of $\beta_2$ is

$$b_2 = \frac{\sum (y - b_1x_1)x_2}{\sum x_2^2} = \frac{\sum x_2y - b_1\sum x_1x_2}{\sum x_2^2}. \qquad (1)$$

For given $b_1$,

$$E(b_2 \mid b_1) = \beta_2 - (b_1 - \beta_1)\frac{\sum x_1x_2}{\sum x_2^2}.$$



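Thus $b_2$ is conditionally biased. However $E(b_1) = \beta_1$, so that as $b_1$
varies $E(b_2) = \beta_2$, i.e. $b_2$ is unbiased. For fixed $x$'s the variance of $b_2$ is

$$\begin{aligned}
V(b_2) &= E\left[\frac{\sum x_2\epsilon}{\sum x_2^2} - (b_1 - \beta_1)\frac{\sum x_1x_2}{\sum x_2^2}\right]^2 \\
&= \frac{\sigma^2}{\sum x_2^2} + \sigma_1^2\frac{(\sum x_1x_2)^2}{(\sum x_2^2)^2} \\
&= \frac{\sigma^2}{\sum x_2^2} + \sigma_1^2 b_{12}^2, \qquad (2)
\end{aligned}$$

where $b_{12}$ is the regression coefficient of $x_1$ on $x_2$.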

We can compare this with the variance that would have been obtained
if the extraneous information had not been used. In that case
the coefficients would have been estimated by least squares, the variance
of the estimate of $\beta_2$ being

$$\frac{\sigma^2}{\sum x_2^2 (1 - r^2)},$$

where $r$ is the observed correlation between $x_1$ and $x_2$. Now

$$V(b_2) = \frac{\sigma^2}{\sum x_2^2 (1 - r^2)} - \frac{r^2 \sigma^2}{\sum x_2^2 (1 - r^2)} + \frac{r^2 \sigma_1^2 \sum x_1^2}{\sum x_2^2}.$$

Thus the diminution in variance due to the use of the extraneous
information is

$$\frac{r^2 \sum x_1^2}{\sum x_2^2} \left[ \frac{\sigma^2}{\sum x_1^2 (1 - r^2)} - \sigma_1^2 \right],$$

which is always positive if

$$\sigma_1^2 < \frac{\sigma^2}{\sum x_1^2 (1 - r^2)},$$

i.e. if the variance of the extraneous estimate of $\beta_1$ is less than the variance
of the internal least-squares estimate of $\beta_1$. When $r = 0$, there is no
improvement, as is otherwise obvious since the estimate of $\beta_2$ is unaffected
by the extraneous information about $\beta_1$.
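
The comparison of the two variances can be checked numerically. The sketch below codes equation (2) and the least-squares variance directly; the function names and the toy sums of squares are hypothetical, with $\sigma^2$ and $\sigma_1^2$ treated as known.

```python
def variance_simple(sigma_sq, sigma1_sq, s11, s22, s12):
    """V(b2) for the simple method, equation (2)."""
    b12 = s12 / s22  # regression coefficient of x1 on x2
    return sigma_sq / s22 + sigma1_sq * b12 ** 2


def variance_least_squares(sigma_sq, s11, s22, s12):
    """Variance of the ordinary least-squares estimate of beta2."""
    r_sq = s12 ** 2 / (s11 * s22)
    return sigma_sq / (s22 * (1.0 - r_sq))


def threshold(sigma_sq, s11, s22, s12):
    """Least-squares variance of the estimate of beta1: the bound on sigma1^2."""
    r_sq = s12 ** 2 / (s11 * s22)
    return sigma_sq / (s11 * (1.0 - r_sq))


# Toy values (hypothetical): sum x1^2 = sum x2^2 = 1, sum x1*x2 = 0.5, sigma^2 = 1.
args = (1.0, 1.0, 0.5)
ls = variance_least_squares(1.0, *args)
gain_when_precise = ls - variance_simple(1.0, 1.0, *args)    # sigma1^2 below the bound
gain_when_imprecise = ls - variance_simple(1.0, 2.0, *args)  # sigma1^2 above the bound
```

With these values the bound is 4/3, so an extraneous variance of 1 reduces the variance of the estimate of $\beta_2$ while a value of 2 increases it.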

To estimate $V(b_2)$ we need an estimate of $\sigma^2$. This can, of course, be
calculated from the internal least-squares analysis in the usual way,
and this will generally give the most convenient estimator to use in
practice. A slightly more efficient estimate which takes into account
the information contributed by the external estimate of $\beta_1$ can, however,
be obtained as follows:
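$$\begin{aligned}
\sum (y - b_1x_1 - \beta_2x_2)^2 &= \sum \{(y - b_1x_1 - b_2x_2) + (b_2 - \beta_2)x_2\}^2 \\
&= \sum (y - b_1x_1 - b_2x_2)^2 + (b_2 - \beta_2)^2 \sum x_2^2.
\end{aligned}$$

Taking the expectation of the left-hand side we have

$$\begin{aligned}
E \sum (y - b_1x_1 - \beta_2x_2)^2 &= E \sum \{(y - \beta_1x_1 - \beta_2x_2) - (b_1 - \beta_1)x_1\}^2 \\
&= (n - 1)\sigma^2 + \sigma_1^2 \sum x_1^2.
\end{aligned}$$

(The factor $n - 1$ occurs instead of $n$ since the observations are measured
from the sample means.)

$$E(b_2 - \beta_2)^2 \sum x_2^2 = \sigma^2 + \frac{\sigma_1^2 (\sum x_1x_2)^2}{\sum x_2^2}, \quad \text{from (2).}$$

Thus,

$$(n - 2)\sigma^2 = E \sum (y - b_1x_1 - b_2x_2)^2 - (1 - r^2)\sigma_1^2 \sum x_1^2.$$

Consequently an unbiased estimate of $\sigma^2$ is given by

$$s^2 = \frac{1}{n - 2} \left\{ \sum (y - b_1x_1 - b_2x_2)^2 - (1 - r^2)s_1^2 \sum x_1^2 \right\}.$$

The first term in the bracket may be evaluated by means of the identity

$$\begin{aligned}
\sum (y - b_1x_1 - b_2x_2)^2 &= \sum (y - b_1x_1)^2 - b_2 \sum (y - b_1x_1)x_2 \\
&= \sum y^2 - 2b_1 \sum x_1y + b_1^2 \sum x_1^2 - b_2^2 \sum x_2^2.
\end{aligned}$$

Substituting $s^2$ and $s_1^2$ for $\sigma^2$ and $\sigma_1^2$ in (2), we get the unbiased estimate
of $V(b_2)$,

$$\hat{V}(b_2) = \frac{1}{(n - 2)\sum x_2^2} \left[ \sum (y - b_1x_1 - b_2x_2)^2 + s_1^2 \left\{ \frac{(n - 1)(\sum x_1x_2)^2}{\sum x_2^2} - \sum x_1^2 \right\} \right]. \qquad (3)$$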





Ey 


Unfortunately this is not distributed as a multiple of $\chi^2$ in the normal
case, and cannot therefore be used to construct an exact $t$ test of $b_2$.
For sufficiently large samples an approximate test may be obtained
by regarding $(b_2 - \beta_2)/\sqrt{\hat{V}(b_2)}$ as a normal variable with zero mean
and unit variance. Alternatively, a better approximation could be constructed
by the method proposed by Welch [3].
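Similar results are found for the general regression

$$Y = \alpha + \beta_1X_1 + \beta_2X_2 + \cdots + \beta_kX_k + \epsilon,$$

where we have an extraneous estimate $b_1$ of $\beta_1$. Let $\boldsymbol{\beta}_2$ denote the vector
$\{\beta_2, \ldots, \beta_k\}$, $\mathbf{y}$, $\mathbf{x}_1$ the vectors of deviations from sample means
of $Y$, $X_1$, and let $\mathbf{X}_2$ denote the matrix of deviations from the sample
means of $X_2, \ldots, X_k$. Then the estimate of $\boldsymbol{\beta}_2$ obtained by straightforward
substitution of $b_1$ for $\beta_1$ is

$$\mathbf{b}_2 = (\mathbf{X}_2'\mathbf{X}_2)^{-1}\mathbf{X}_2'(\mathbf{y} - b_1\mathbf{x}_1).$$

The variance matrix of this set of estimates is

$$\begin{aligned}
V(\mathbf{b}_2) &= \sigma^2(\mathbf{X}_2'\mathbf{X}_2)^{-1} + \sigma_1^2(\mathbf{X}_2'\mathbf{X}_2)^{-1}\mathbf{X}_2'\mathbf{x}_1\mathbf{x}_1'\mathbf{X}_2(\mathbf{X}_2'\mathbf{X}_2)^{-1} \\
&= \sigma^2(\mathbf{X}_2'\mathbf{X}_2)^{-1} + \sigma_1^2\mathbf{b}_{12}\mathbf{b}_{12}',
\end{aligned}$$

where $\mathbf{b}_{12}$ is the vector of sample regression coefficients of $x_1$ on $x_2, \ldots, x_k$.
An unbiased estimator of $\sigma^2$ is given by

$$s^2 = \frac{1}{n - k} \left[ \sum (y - b_1x_1 - \cdots - b_kx_k)^2 - (1 - R_{12}^2)s_1^2 \sum x_1^2 \right],$$

where $R_{12}$ is the multiple correlation between $x_1$ and $x_2, \ldots, x_k$.
A slightly more efficient estimator of the multiple correlation of $y$ on
$x_1$ and $x_2$ than that given by least squares is $R$ defined by

$$1 - R^2 = \frac{(n - k + 1)s^2}{\sum y^2}.$$


3. EFFICIENT METHOD
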


The above procedure, though simple and direct, does not make the
most efficient use of the available information, since no attempt is
made to improve the estimate of $\beta_1$ by means of the multiple regression
data. The question of efficiency can be explored by considering the
following general problem.

Suppose that $\mathbf{b}_1$ is a vector of unbiased estimators of a set of parameters
$\boldsymbol{\beta}_1 = \{\beta_1, \ldots, \beta_h\}$ and that $\mathbf{b}_2$ is an independent vector of
unbiased estimators of the extended set $\boldsymbol{\beta} = \{\beta_1, \ldots, \beta_k\}$ where $k \ge h$.
The variance matrices $V(\mathbf{b}_1) = V_1$ and $V(\mathbf{b}_2) = V_2$ are assumed to be
known and of ranks $h$ and $k$ respectively. We seek the best unbiased
linear estimators of $\beta_1, \ldots, \beta_k$, i.e. those which are linear in the elements
of $\mathbf{b}_1$ and $\mathbf{b}_2$ and whose variances are not greater than the variances
of any other unbiased linear estimators.





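Now

$$E\begin{bmatrix} \mathbf{b}_1 \\ \mathbf{b}_2 \end{bmatrix} = \begin{bmatrix} I_h & 0 \\ I_h & 0 \\ 0 & I_{k-h} \end{bmatrix} \boldsymbol{\beta},$$

where $I_h$, $I_{k-h}$ are unit matrices of orders $h$ and $k - h$, and 0 represents
any matrix, all of whose elements are zero. Also

$$V\begin{bmatrix} \mathbf{b}_1 \\ \mathbf{b}_2 \end{bmatrix} = \begin{bmatrix} V_1 & 0 \\ 0 & V_2 \end{bmatrix}.$$

We now apply Aitken's [1] extension of Gauss's least-squares theorem.
This states that if $E(x) = P\alpha$ and $V(x) = V$, $P$ being a known matrix,
then provided that $V^{-1}$ and $(P'V^{-1}P)^{-1}$ exist, the vector of best
unbiased linear estimators of the elements of $\alpha$ is $(P'V^{-1}P)^{-1}P'V^{-1}x$.

In the present problem,

$$V^{-1} = \begin{bmatrix} V_1^{-1} & 0 \\ 0 & V_2^{-1} \end{bmatrix} = \begin{bmatrix} W_1 & 0 \\ 0 & W_2 \end{bmatrix} = \begin{bmatrix} W_1 & 0 & 0 \\ 0 & W_{21} & W_{22} \\ 0 & W_{23} & W_{24} \end{bmatrix},$$

say, where $W_{21}$ is the matrix formed by the first $h$ rows and columns of
$W_2$; $W_{22}$, $W_{23}$, and $W_{24}$ are defined similarly; also

$$P = \begin{bmatrix} I_h & 0 \\ I_h & 0 \\ 0 & I_{k-h} \end{bmatrix},$$

so that

$$P'V^{-1}P = \begin{bmatrix} W_1 + W_{21} & W_{22} \\ W_{23} & W_{24} \end{bmatrix} = W_1^* + W_2,$$

where $W_1^*$ is the $k \times k$ matrix in which $W_1$ occupies the leading position,
the remaining elements being zeros. Similarly

$$P'V^{-1}x = W_1^*\mathbf{b}_1^* + W_2\mathbf{b}_2,$$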





where $\mathbf{b}_1^*$ is the $k \times 1$ vector in which the first $h$ elements are the elements
of $\mathbf{b}_1$, the remainder being zeros.

Hence, applying Aitken's result, the best unbiased linear estimators
of the elements of $\boldsymbol{\beta}$ are given by

$$\mathbf{b} = [W_1^* + W_2]^{-1}[W_1^*\mathbf{b}_1^* + W_2\mathbf{b}_2].$$

The inverse matrix $[W_1^* + W_2]^{-1}$ exists since $W_1^*$ and $W_2$ are positive
semi-definite and positive definite respectively, and hence $W_1^* + W_2$ is positive
definite. The normal equations therefore take the form

$$[W_1^* + W_2]\mathbf{b} = W_1^*\mathbf{b}_1^* + W_2\mathbf{b}_2. \qquad (4)$$

The variance matrix is

$$V(\mathbf{b}) = [W_1^* + W_2]^{-1}. \qquad (5)$$

These results are of particular interest since they illustrate the straightforward
fashion in which Gauss's theorem may be generalized to deal
with the estimation of parametric vectors rather than scalars.

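In the application to the problem considered above, $\mathbf{b}_1$ consists of
the single element $b_1$ with variance $\sigma_1^2$. Let $\mathbf{b}_2$ denote the vector of
least-squares estimates of the $\beta$'s obtained from the regression observations
by ignoring the extraneous information, i.e. $\mathbf{b}_2 = (X'X)^{-1}X'y$, where $X$
stands for the matrix of deviations from the sample means of $X_1, \ldots, X_k$. Then

$$V(\mathbf{b}_2) = \sigma^2(X'X)^{-1},$$

so that

$$W_2 = \frac{1}{\sigma^2}X'X, \qquad W_2\mathbf{b}_2 = \frac{1}{\sigma^2}X'y.$$

Thus the normal equations (4) take the form

$$\left\{ \frac{1}{\sigma_1^2} \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix} + \frac{1}{\sigma^2}X'X \right\} \hat{\boldsymbol{\beta}} = \frac{1}{\sigma_1^2} \begin{bmatrix} b_1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} + \frac{1}{\sigma^2}X'y,$$

where for convenience we write $\hat{\boldsymbol{\beta}}$ for $\mathbf{b}$, i.e.

$$\begin{aligned}
\hat\beta_1\left(\sum x_1^2 + \lambda\right) + \hat\beta_2 \sum x_1x_2 + \cdots + \hat\beta_k \sum x_1x_k &= \sum x_1y + \lambda b_1 \\
\hat\beta_1 \sum x_1x_2 + \hat\beta_2 \sum x_2^2 + \cdots + \hat\beta_k \sum x_2x_k &= \sum x_2y \\
&\ \ \vdots \\
\hat\beta_1 \sum x_1x_k + \hat\beta_2 \sum x_2x_k + \cdots + \hat\beta_k \sum x_k^2 &= \sum x_ky,
\end{aligned} \qquad (6)$$

where $\lambda = \sigma^2/\sigma_1^2$.

The variance matrix is, from (5),

$$V(\hat{\boldsymbol{\beta}}) = \sigma^2 \begin{bmatrix} \sum x_1^2 + \lambda & \sum x_1x_2 & \cdots & \sum x_1x_k \\ \sum x_1x_2 & \sum x_2^2 & \cdots & \sum x_2x_k \\ \vdots & & & \vdots \\ \sum x_1x_k & \sum x_2x_k & \cdots & \sum x_k^2 \end{bmatrix}^{-1}.$$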


The only difference from the ordinary least-squares expressions is the
addition of $\lambda$ to $\sum x_1^2$. Thus if $\lambda$ were known it would be no more difficult
to perform the efficient analysis than the least-squares analysis.
The difficulty is, of course, that in practice $\lambda$ will not be known. It may,
however, be estimated by beginning with a least-squares analysis of
the regression data to get an estimate of $\sigma^2$. Confidence limits can be
put on $\lambda$ by the ordinary variance-ratio technique. The increments $\delta b$
to be added to the least-squares estimates may then be found from the
equations

$$\begin{aligned}
\delta b_1\left(\sum x_1^2 + \lambda\right) + \delta b_2 \sum x_1x_2 + \cdots + \delta b_k \sum x_1x_k &= \lambda(b_1 - b_1') \\
\delta b_1 \sum x_1x_2 + \delta b_2 \sum x_2^2 + \cdots + \delta b_k \sum x_2x_k &= 0 \\
&\ \ \vdots \\
\delta b_1 \sum x_1x_k + \delta b_2 \sum x_2x_k + \cdots + \delta b_k \sum x_k^2 &= 0,
\end{aligned} \qquad (7)$$

where $b_1'$ is the least-squares estimate of $\beta_1$, as can be seen from (6).
The calculation of the increments for the upper and lower confidence
limits of $\lambda$ will give an idea of the sensitivity of the estimates to variations
in $\lambda$.

It is worth noting that the efficient estimate of $\beta_1$ may be calculated
directly without solving the equations (6). This is done by taking the
weighted mean of $b_1$ and the least-squares value $b_1'$, the weights being
the reciprocals of the respective variances. Thus

$$\hat\beta_1 = \frac{\dfrac{b_1}{\sigma_1^2} + \dfrac{b_1'}{\sigma_1'^2}}{\dfrac{1}{\sigma_1^2} + \dfrac{1}{\sigma_1'^2}}, \qquad (8)$$

where $\sigma_1'^2 = V(b_1')$ and is the top left-hand element of the variance
matrix $\sigma^2(X'X)^{-1}$. This value will be found to satisfy the equations
(6). In practice $\sigma^2$ is unknown, but an unbiased estimate of it can be
calculated in the usual way from the least-squares analysis.
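
Equation (8) is a precision-weighted mean and can be written as a few lines of code. The function name is hypothetical; the inputs used in the example are the extraneous and least-squares estimates of the income elasticity, with their estimated variances, quoted in the numerical example of Section 4.

```python
def precision_weighted_mean(estimates, variances):
    """Weighted mean with weights equal to reciprocal variances, as in (8)."""
    weights = [1.0 / v for v in variances]
    return sum(w * e for w, e in zip(weights, estimates)) / sum(weights)


# b1 = 0.575798 (variance 0.0558764) and b1' = 0.229520 (variance 0.190701).
beta1_hat = precision_weighted_mean([0.575798, 0.229520], [0.0558764, 0.190701])
```

The result agrees to about six decimal places with the value 0.497328 obtained in Section 4 by solving (6).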

In any particular case we must therefore decide whether to calculate
$\hat\beta_1, \ldots, \hat\beta_k$ simultaneously as the solution of (6) (or, equivalently,
of (7)) or alternatively whether to calculate $\hat\beta_1$ from (8), the remaining
coefficients being obtained from the last $k - 1$ equations of (6). If $X'X$
has been inverted during the least-squares analysis, so that $\sigma_1'^2$ or its
estimate is known, it will obviously be easier to calculate $\hat\beta_1$ directly
from (8). If, on the other hand, the least-squares normal equations have
been solved without inverting $X'X$ it will be easier to solve (6) or (7)
rather than invert $X'X$ first.


4. NUMERICAL EXAMPLE 


The foregoing results will now be illustrated by means of data on the 
consumption of pork in the United Kingdom 1920-38. We wish to fit 
a regression of the form 


$$Y = \alpha + \beta_1X_1 + \beta_2X_2 + \epsilon,$$

where $Y$ is log consumption of pork per head,
$X_1$ is log income per head,
$X_2$ is log relative price of pork,
$\beta_1$ is the income elasticity, and
$\beta_2$ is the price elasticity of demand for pork.


The analysis of a set of family budget data¹ yields the value
$b_1 = 0.575798$ as the estimate of $\beta_1$, with an estimated variance of
0.0558764. From the annual figures of aggregate consumption, income,
etc. we obtain the values of sums of squares and products of deviations
from the means,





1 For further information about the data see the article by Stone [2]. One point that may be men- 
tioned here is that the time-series observations were transformed by taking first differences before 
calculating the sums of squares and products, in order to reduce the effect of serial correlation. 





$$\begin{aligned}
\sum y^2 &= 0.0352895, & \sum x_1y &= -0.0023041, \\
\sum x_1^2 &= 0.0086619, & \sum x_2y &= -0.0235779, \\
\sum x_2^2 &= 0.0410816, & \sum x_1x_2 &= 0.0070014.
\end{aligned}$$

A least-squares analysis of the time-series data gives the estimates of
$\beta_1$ and $\beta_2$

$$b_1' = 0.229520 \quad \text{and} \quad b_2' = -0.613043,$$

with estimated variances,

$$V(b_1') = 0.190701 \quad \text{and} \quad V(b_2') = 0.040208.$$

For the simple method described above we accept the value
$b_1 = 0.575798$ as the estimate of $\beta_1$ and take as our estimate of $\beta_2$ the value
given by (1), i.e.

$$b_2 = \frac{-0.0235779 - (0.575798)(0.0070014)}{0.0410816} = -0.672060.$$


For the estimate of variance of $b_2$ we need

$$\begin{aligned}
\sum (y - b_1x_1 - b_2x_2)^2 &= 0.0352895 + (0.575798)\{-2(-0.0023041) + (0.575798)(0.0086619)\} \\
&\quad - (0.4516646)(0.0410816) \\
&= 0.0222595.
\end{aligned}$$


Substituting in (3) we have for the unbiased estimate of the variance
of $b_2$,

$$\hat{V}(b_2) = \frac{1}{16(0.0410816)} \left[ 0.0222595 + (0.0558764) \left\{ \frac{17(0.0070014)^2}{0.0410816} - 0.0086619 \right\} \right] = 0.0348527.$$
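
The arithmetic of the simple method can be reproduced from the quoted sums of squares and products. This is a checking sketch only: the variable names are ours, and the sample size $n = 18$ is an assumption inferred from the divisors 16 and 17 that appear in the substitution into (3).

```python
# Sums of squares and products of deviations, as quoted in the text.
Syy, S1y, S2y = 0.0352895, -0.0023041, -0.0235779
S11, S22, S12 = 0.0086619, 0.0410816, 0.0070014
b1, s1_sq = 0.575798, 0.0558764  # extraneous estimate and its estimated variance
n = 18  # assumption: inferred from n - 2 = 16 and n - 1 = 17


def simple_method():
    # Equation (1): regress y - b1*x1 on x2.
    b2 = (S2y - b1 * S12) / S22
    # Residual sum of squares via the identity preceding equation (3).
    rss = Syy - 2 * b1 * S1y + b1 ** 2 * S11 - b2 ** 2 * S22
    # Equation (3): unbiased estimate of V(b2).
    var_b2 = (rss + s1_sq * ((n - 1) * S12 ** 2 / S22 - S11)) / ((n - 2) * S22)
    return b2, rss, var_b2


b2, rss, var_b2 = simple_method()
```

The three results agree with the printed values -0.672060, 0.0222595 and 0.0348527 to within rounding in the last place.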


This may be compared with the figure of 0.0362922 obtained by substituting
the least-squares estimator of $\sigma^2$ together with $s_1^2$ for $\sigma_1^2$ in (2).
The apparent reduction in the variance is of course due simply to the
use of a more efficient estimate of $\sigma^2$; the actual variance is unaffected.

To calculate the efficient estimates of $\beta_1$ and $\beta_2$ we need first an estimate
of $\lambda$. The estimated residual variance of the time-series data is
0.00142427, whence

$$\lambda = \frac{0.00142427}{0.0558764} = 0.0254897.$$


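Substituting in (6) we have the equations

$$\begin{aligned}
(0.0086619 + 0.0254897)\hat\beta_1 + 0.0070014\hat\beta_2 &= -0.0023041 + (0.0254897)(0.575798), \\
0.0070014\hat\beta_1 + 0.0410816\hat\beta_2 &= -0.0235779,
\end{aligned}$$

which yield the estimates

$$\hat\beta_1 = 0.497328 \quad \text{and} \quad \hat\beta_2 = -0.658687.$$

The estimated variance matrix is

$$0.00142427 \begin{bmatrix} 0.0341516 & 0.0070014 \\ 0.0070014 & 0.0410816 \end{bmatrix}^{-1} = \begin{bmatrix} 0.0432143 & -0.0073649 \\ -0.0073649 & 0.0359245 \end{bmatrix}.$$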








Thus the estimated variances of the estimates of $\beta_2$ given by least
squares, the simple method and the efficient method respectively are


0.040208, 0.034853 and 0.035924. 


These values illustrate the gains achieved by the methods developed 
above, in comparison with the least-squares method. As it happens, 
the estimated variance given by the simple method is smaller than that 
given by the efficient method; this is presumably due to sampling fluc- 
tuations. 
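
The efficient estimates can be recomputed in the same way by solving the two equations of (6) with Cramer's rule. The variable and function names below are ours; all inputs are the figures quoted in the text.

```python
# Quantities quoted in the text.
S11, S22, S12 = 0.0086619, 0.0410816, 0.0070014
S1y, S2y = -0.0023041, -0.0235779
b1, s1_sq = 0.575798, 0.0558764
resid_var = 0.00142427  # estimated residual variance of the time-series data


def efficient_estimates():
    lam = resid_var / s1_sq            # estimate of lambda = sigma^2 / sigma_1^2
    a11 = S11 + lam                    # lambda is added to sum x1^2 only
    rhs1 = S1y + lam * b1
    det = a11 * S22 - S12 ** 2
    beta1 = (rhs1 * S22 - S12 * S2y) / det
    beta2 = (a11 * S2y - S12 * rhs1) / det
    # Estimated variance matrix: residual variance times the inverse matrix.
    variance = [
        [resid_var * S22 / det, -resid_var * S12 / det],
        [-resid_var * S12 / det, resid_var * a11 / det],
    ]
    return lam, (beta1, beta2), variance


lam, (beta1, beta2), variance = efficient_estimates()
```

The output matches the printed $\lambda$, estimates and variance matrix to within rounding, including the value 0.035924 compared above with 0.040208 and 0.034853.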


REFERENCES 


[1] Aitken, A. C., “On least squares and the linear combination of observations,”
Proceedings of the Royal Society of Edinburgh, 55 (1935), 42-48. 

[2] Stone, R., “The demand for food in the United Kingdom before the war,” 
Metroeconomica, 3 (1951), 8-27. 

[3] Welch, B. L., “The generalisation of ‘Student’s’ problem, when different 
population variances are involved,” Biometrika, 34 (1947), 28-35.





A HOLLERITH TECHNIQUE FOR THE SOLUTION 
OF NORMAL EQUATIONS 


M. J. R. HEALY AND G. V. DYKE
Rothamsted Experimental Station 


IN THE critical study of the results of large-scale sample surveys it is
frequently necessary to consider data classified in several different
ways, and to attempt to disentangle the effects of the various classifications
which will usually not be orthogonal to one another. One way of
doing this is to fit constants by least squares, assuming the effects implied
by the classifications to be additive; for a discussion of the method,
see Yates [6, p. 137 et seq.]. The process of fitting involves the solution
of the normal equations, a set of simultaneous equations equal in
number to the total number of categories in all the classifications. If 
this number is at all large, the computations become very lengthy, 
and it is desirable to use Hollerith machinery, more especially as the 
main computations of the survey will often be done on Hollerith ma- 
chines, and the data will already be punched on cards. 

At least two methods of solving simultaneous equations with the aid 
of Hollerith machines have been published [3, 4]. Both employ a tech- 
nique of pivotal condensation, and demand the use of a range of 
machines outside the scope of a small installation. In the present context
the large number of equations may lead to a serious accumulation
of rounding errors, and there are advantages in the alternative tech- 
nique of successive approximation, as described, for example, by Stev- 
ens [5]. The present paper gives a method of mechanising this tech- 
nique, using only the basic Hollerith machines, the sorter and tabu- 
lator. For producing the working pack, a reproducer is desirable though 
not essential. 


THE METHOD OF SOLUTION 


The method of solution will be illustrated on a small scale by means 
of an example with three classifications used by Stevens [5]. The com- 
putations will be set out in some detail, and the process of mechanisa- 
tion can then be briefly explained. 

The necessary data, abstracted from the complete table given by 
Stevens, is shown in Table 1. Here we have 2-way tables showing the 
number of units in the various sub-classes, and the total number of 
units and total “yield” for each category of the three classifications. 
From this table, the normal equations (Table 2) can be written down 
immediately; the diagonal terms come from the marginal totals, the 


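TABLE 1

[Two-way tables of sub-class numbers, with the total number of units and total yield for each category of the three classifications. The body of the table is not recoverable; the surviving entries are the totals 7, 23, 13 and the yields 602, 1982, 886.]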










TABLE 2 








+ 344+ 2h42/3+2),+ 68:4+ 338,= 572 

+ 34+ 324+2/);+ 4+ 681+ 3s,.= 748 

9d; ad 24+ 3l. +21; +21,+ 581+ 43. = 733 
9d,+ 2h+ 3l, +21; +21, + 63, + 332 = 815 


3d, +3d, +2d; +2d,; +101; 4. 781 + 382 = 734 
2d; +3d, +3d; +3d, + 1 1l, + 63; 4 582 = 819 
2d, +2d, +2d; +2d, +8i; + 68; aa 282 = 713 
2d:+ d2+2d3;+2d, +714,4+ 48:+ 38:= 602 


6d, +64, +5d;+6d,+ 71+ 6l2+61; +41,+23s; = 1982 
3d, +3d2+4d3+3d,+ 34+ 512+21;+3% +13s2= 886 





TABLE 3 








ly ls 





333 
-333 








SOLUTION OF NORMAL EQUATIONS 811 


other coefficients from the body of the table, while the right-hand sides 
of the equations are the total yields. The basic problem is the solution 
of these equations. Notice that they are not independent; in fact what 
we determine are the differences between diets, between litters and be- 
tween sexes. 

The first step is to divide each equation by its diagonal term, obtain- 
ing the coefficients set out in Table 3. These equations with rounded-off 
coefficients are those which we actually solve, and this table eventually 
serves as a punching schedule for the working pack. As first approximations
we take the straight means of each category, which appear as
the right-hand sides of the equations in Table 3. However, as only 
differences are to be estimated, we can add or subtract any convenient 
quantity from each set of values, and in practice we subtract the small- 
est value in each set from the others. Thus, subtracting 63.56 from the 
d’s, 73.40 from the l’s and 68.15 from the s’s and retaining 3 significant 
figures, we arrive at the first column of Table 4. Inserting these ap- 
proximations into the first four equations, we obtain improved values 
for the d’s, as follows:— 

d1 ≈ 63.56 − (.333 × 0.0 + .222 × 1.0 + ··· + .333 × 0.0) = 45.0494
d2 ≈ 83.11 − (.333 × 0.0 + .333 × 1.0 + ··· + .333 × 0.0) = 65.8870      A
d3 ≈ 81.44 − (.222 × 0.0 + .333 × 1.0 + ··· + .444 × 0.0) = 64.8164
d4 ≈ 90.56 − (.222 × 0.0 + .333 × 1.0 + ··· + .333 × 0.0) = 71.9384

We subtract 45.0494 from each of these new approximations, round off
to one decimal and use them in the next set of equations to get improved
values of the l’s:

l1 ≈ 73.40 − (.300 × 0.0 + .300 × 20.8 + ··· + .300 × 0.0) = 45.2200
l2 ≈ 74.45 − (.182 × 0.0 + .273 × 20.8 + ··· + .455 × 0.0) = 46.2125     B
l3 ≈ 89.12 − (.250 × 0.0 + .250 × 20.8 + ··· + .250 × 0.0) = 58.7450
l4 ≈ 86.00 − (.286 × 0.0 + .143 × 20.8 + ··· + .429 × 0.0) = 59.3914

Subtracting 45.2200 and rounding off, we go on to the next two equations:

s1 ≈ 86.17 − (.261 × 0.0 + .261 × 20.8 + ··· + .174 × 14.2) = 63.1684    C
s2 ≈ 68.15 − (.231 × 0.0 + .231 × 20.8 + ··· + .231 × 14.2) = 45.2887


We thus arrive at the set of second approximations in Table 4. Repeat- 
ing the whole cycle we obtain 3rd approximations, and these are found 
to be unaltered by further cycles. 
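
The cycle of successive approximation just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure as run on the machines: the coefficient matrix is the one implied by the normal equations of Table 2, the right-hand sides and first approximations are the figures quoted in the text and in Table 4, and the names `N`, `C`, `MEANS`, `FIRST`, `BLOCKS`, `raw_update` and `cycle` are hypothetical.

```python
# Coefficients of the normal equations (Table 2).
# Variable order: d1..d4 (diets), l1..l4 (litters), s1, s2 (sexes).
N = [
    [9, 0, 0, 0, 3, 2, 2, 2, 6, 3],
    [0, 9, 0, 0, 3, 3, 2, 1, 6, 3],
    [0, 0, 9, 0, 2, 3, 2, 2, 5, 4],
    [0, 0, 0, 9, 2, 3, 2, 2, 6, 3],
    [3, 3, 2, 2, 10, 0, 0, 0, 7, 3],
    [2, 3, 3, 3, 0, 11, 0, 0, 6, 5],
    [2, 2, 2, 2, 0, 0, 8, 0, 6, 2],
    [2, 1, 2, 2, 0, 0, 0, 7, 4, 3],
    [6, 6, 5, 6, 7, 6, 6, 4, 23, 0],
    [3, 3, 4, 3, 3, 5, 2, 3, 0, 13],
]

# Table 3: each equation divided by its diagonal term, coefficients
# rounded to 3 decimals, unit diagonal dropped.
C = [[0.0 if i == j else round(a / row[i], 3) for j, a in enumerate(row)]
     for i, row in enumerate(N)]

# Right-hand sides of Table 3 (straight category means), as quoted in the text.
MEANS = [63.56, 83.11, 81.44, 90.56, 73.40, 74.45, 89.12, 86.00, 86.17, 68.15]

# First approximations (Table 4, first column).
FIRST = [0.0, 19.6, 17.9, 27.0, 0.0, 1.0, 15.7, 12.6, 18.0, 0.0]

BLOCKS = [range(0, 4), range(4, 8), range(8, 10)]  # d's, l's, s's


def raw_update(x, block):
    """Unadjusted new values for one classification (lines A, B or C)."""
    return [MEANS[i] - sum(c * v for c, v in zip(C[i], x)) for i in block]


def cycle(x):
    """One full cycle: improve the d's, then the l's, then the s's."""
    x = list(x)
    for block in BLOCKS:
        new = raw_update(x, block)
        low = min(new)  # only differences are estimated
        for i, value in zip(block, new):
            x[i] = round(value - low, 1)  # subtract smallest, keep one decimal
    return x


x = FIRST
for _ in range(20):  # stop as soon as a cycle leaves the values unaltered
    nxt = cycle(x)
    if nxt == x:
        break
    x = nxt
print(x)
```

Calling `raw_update(FIRST, BLOCKS[0])` reproduces the four values on the lines marked A, and the loop stops once a further cycle changes nothing.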

Solutions correct to three figures have now been obtained and in view
of the rounding off of the coefficients no further accuracy can normally
be achieved by this technique without modification; three figures will
in any case be sufficient in sample survey work. It is convenient to
make final adjustments to make the mean of each group equal to the
general mean, and these figures are given in Table 4 together with the
solution obtained by Stevens.


                         TABLE 4

               Approx.                        Stevens’
        1st     2nd     3rd     solution      solution

d1      0.0     0.0     0.0       62.7         62.665
d2     19.6    20.8    21.0       83.7         83.678
d3     17.9    19.8    19.8       82.5         82.426
d4     27.0    26.9    26.9       89.6         89.555
l1      0.0     0.0     0.0       72.4         72.399
l2      1.0     1.0     1.0       73.4         73.391
l3     15.7    13.5    13.5       85.9         85.949
l4     12.6    14.2    14.2       86.6         86.594
s1     18.0    18.0    17.9       88.5         88.505
s2      0.0     0.0     0.0       70.6         70.662



THE HOLLERITH TECHNIQUE 


It is apparent from the cycle of iteration set out in full above that the 
basic operation is the computing of sums of products, Σxy say, where
the x’s stay fixed throughout the problem (they are in fact the coeffi- 
cients in Table 3). It is natural therefore to punch these quantities on 
cards. The actual multiplications are done by successive addition, as 
on a desk calculator. Thus if 123 is to be multiplied by 456, we pass 
through the tabulator 4 cards punched 12300, 5 cards punched 1230 
and 6 cards punched 123. 
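
The card-counting multiplication can be mimicked as follows. This is a minimal sketch, assuming three-figure multipliers as in the text; the function names `card_counts` and `multiply_by_cards` are hypothetical and stand in for the hand-picking and the tabulator run respectively.

```python
def card_counts(multiplier, digits=3):
    """Digits of the multiplier, most significant first.

    These are the numbers of cards of each type to be picked out.
    """
    return [int(ch) for ch in str(multiplier).zfill(digits)]


def multiply_by_cards(multiplicand, multiplier, digits=3):
    """Form multiplicand * multiplier by successive addition only."""
    total = 0
    for position, count in enumerate(card_counts(multiplier, digits)):
        # Value punched on this type of card: 12300, 1230, 123 for 123.
        punched = multiplicand * 10 ** (digits - 1 - position)
        for _ in range(count):
            total += punched  # one card passing through the tabulator
    return total


print(card_counts(456))             # [4, 5, 6]
print(multiply_by_cards(123, 456))  # 56088
```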

The construction of the working pack will now be described in detail.
One set of 27 cards as described in the previous paragraph is used for
each column of Table 3, that is, for each variable, the units in the
diagonal being omitted. Leaving columns 1-5 for indicative material,
the basic pack will be punched as follows (- denoting a blank column):

Col. no.      6  10    15    20    25    30
Card no. 1a   -----|-----|-----|-----|300--|182   etc.
         2a   -----|-----|-----|-----|300--|273
         3a   -----|-----|-----|-----|200--|273
         4a   -----|-----|-----|-----|200--|273
         5a   333--|333--|222--|222--|-----|---
         etc.



Each of these a cards is copied a further eight times. Nine more cards
are then punched for each variable, with the information in cols. 6-78
transferred to cols. 7-79; these will be referred to as cards 1b, 2b, and so on.










A set of c cards is similarly punched with the information appearing in 
cols. 8-80. 

The indicative matter in cols. 1-5 is used for sorting, controlling and 
checking. All a cards are punched X in col. 4, b cards are punched 1 
in this column, and c cards are punched 1 in col. 5. By leading these 
columns to a counter and using the “29 feature”¹ a check can be made
that the right multipliers are used at each stage. Column 3 is not 
needed in the present example, but in almost all practical cases the 
number of equations will be such that two or more cards will be needed 
for each variable; these can be distinguished by punching in this col- 
umn. 

To form the multipliers, the correct number of each type of card
is picked out by hand from the pockets of the sorter. Using always
3-figure multipliers, the 12 pockets of the machine will hold 4 variables 
at a time, so that all cards 1-4 are punched 0 in col. 1, all cards 5-8 
are punched 1, and so on. Control is made possible by over-punching 
XY, X, Y or nothing to distinguish the variables in each set of four, 
these punches being ignored by the sorter. The punching in col. 2 is 
designed to bring the cards into the sorter pockets in the proper order, 
thus 
Punch in col. 2 e- 82 4 @&2& 24 Bid -O°aaewe 


Cards 1a 1b 1c 2a 2b 2c 3a 3b 3c 4a 4b 4c
5a 5b 5c 6a 6b etc. 


To start the solution, the first approximations are calculated (Table 
4, column 1). The cards are sorted on col. 1 and all cards 1-4 removed, 
since they are not needed in approximating to the d’s. Cards 5-8 are 
sorted on col. 2 and picked by hand to give the correct multipliers. 
Reference to the equations marked A above shows that the numbers 
of cards required are 


5a  5b  5c  6a  6b  6c  7a  7b  7c  8a  8b  8c
 0   0   0   0   1   0   1   5   7   1   2   6


these numbers are simply the first approximations to the l’s. The re- 
maining cards are sorted on col. 2 and hand-picked in their turn. 

The cards picked out are now tabulated. Cols. 4-5, 6-10, 11-15,
16-20, 21-25 are plugged to the counters and by controlling on col. 1,
cols. 4-5 are totalled at the end of each variable to check that the
hand-picking has been done correctly. The other counters total at the
end of the run, and the printed record shows

 10
157
126
180   185106   172230   166236   186216

which on subtraction from the right-hand sides gives the second
approximations found at A.

¹ This feature allows the punching of numbers from 0-29 in one column, the “tens” and “twenties”
being overpunched X and Y respectively.
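
The tabulation run can be imitated in Python to show where the printed figures come from. The sketch below rests on assumptions: the coefficients are those of the four d-equations of Table 3 with the decimal point dropped, the multipliers are the first approximations of Table 4 likewise, and each card is modelled as a small dictionary; the names `COEFFS`, `MULTIPLIERS`, `pick_cards` and `tabulate` are hypothetical. The check values 100, 10 and 1 stand for the punching in cols. 4-5 of the a, b and c cards.

```python
# Coefficients (x1000) of each l- and s-variable in the four d-equations.
COEFFS = {
    "l1": [333, 333, 222, 222],
    "l2": [222, 333, 333, 333],
    "l3": [222, 222, 222, 222],
    "l4": [222, 111, 222, 222],
    "s1": [667, 667, 556, 667],
    "s2": [333, 333, 444, 333],
}

# First approximations with the decimal point dropped (1.0 -> 10, 15.7 -> 157).
MULTIPLIERS = {"l1": 0, "l2": 10, "l3": 157, "l4": 126, "s1": 180, "s2": 0}


def pick_cards(coeffs, multipliers):
    """Hand-picking: take as many a, b and c cards as the multiplier digits."""
    pack = []
    for var, fields in coeffs.items():
        digits = f"{multipliers[var]:03d}"
        for shift, count in zip((100, 10, 1), map(int, digits)):
            # a cards carry the coefficient shifted two places, b cards one, c none.
            card = {"var": var, "check": shift, "fields": [f * shift for f in fields]}
            pack.extend([card] * count)
    return pack


def tabulate(pack, n_fields=4):
    """Add the check columns per variable and the coefficient fields overall."""
    checks = {}
    counters = [0] * n_fields
    for card in pack:
        checks[card["var"]] = checks.get(card["var"], 0) + card["check"]
        for i, value in enumerate(card["fields"]):
            counters[i] += value
    return checks, counters


checks, counters = tabulate(pick_cards(COEFFS, MULTIPLIERS))
print(checks)    # check totals: the multipliers actually formed
print(counters)  # sums of products, to four decimals with the point dropped
```

The check totals reproduce the multipliers 10, 157, 126 and 180, and the four counters hold the sums of products to be subtracted from the right-hand sides.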

The multipliers for the d-variable cards are now known so these are
sorted and hand-picked. The l-variable cards are not needed at the
next stage and are removed from the pack. The tabulator counters are
replugged to cols. 4-5, 26-30, 31-35, 36-40, 41-45 and the following
tabulation gives:

208
198
269
180   281800   282375   303750   266086


which leads to the approximations found at B. The rest of the solution 
continues on the lines detailed in the previous section. 

A slight modification is possible which reduces the effect of rounding 
off the coefficients of the original equations. The coefficients are 
punched in the a pack to 5 decimals, and are reduced to 4 and 3 deci- 
mals for the b and c packs. If these packs are produced mechanically 
on a reproducer, the reduction is made without rounding off (compare 
[4], pp. 162-3). 


THE METHOD IN PRACTICE 


The method has been used in the analysis of two years’ results from 
a survey of maincrop potatoes [2]. There were 30 and 28 constants re- 
spectively representing 6 classifications of which the largest contained 
11 categories. In each case, 3-figure accuracy was attained after 4 cycles 
of the iteration. When some experience had been gained, each cycle 
took about 45 minutes to complete. Preparation of the working pack 
took about 4 hours using a reproducer; this was reduced to little more 
than 1 hour when a summary punch became available, so that the 
equivalent of Table 3 could be punched at the same time as Table 1
was being produced on the tabulator. The complete solution thus took 
about 5-8 hours working time. The same iteration carried out on desk 
machines took about 4 hours for each cycle. 

Mistakes in hand-picking were rare, but it was found worth while to 
use coloured cards for the a cards representing the first figures of each 
multiplier. 

Some difficulty may be caused by highly correlated variables. If two
such variables are present (that is, if one of the non-diagonal coefficients
is near to 1) the corrections tend to pass backwards and forwards
between them showing only slow convergence to zero. There is no
particular point in continuing the iteration for the sake of these variables
only, as they will in fact be ill-determined and high precision in the
solution will only be misleading.

It is known that for normal equations the process described above
always converges. Convergence may be slow, however, and Aitken has
described a technique for accelerating it [1]. Its application is made
awkward here by the fact that the corrections are adjusted at each
stage. In the two large examples so far attempted, it has been quicker
to run another cycle or two of the iteration, but in other cases
Aitken’s method may be useful. His iteration is not quite the one used
above, but the practical differences are trivial.

We are indebted to Dr. F. Yates for the original suggestion which 
led to the method set out in this paper. 


SUMMARY 


A method is described for fitting constants to survey data by least 
squares, using a Hollerith sorter and tabulator. 


REFERENCES 


[1] Aitken, A. C., “Studies in practical mathematics, V; On the iterative
solution of a system of linear simultaneous equations,” Proceedings Royal
Society Edinburgh, 63A (1949-50), 52-60.

[2] Boyd, D. A., and Dyke, G. V., “Maincrop potato growing in England and
Wales,” N.A.A.S. Quarterly Review, 10 (1950), 47-57.

[3] Fox, L., Huskey, H. D., and Wilkinson, J. H., “Notes on the solution of
algebraic linear simultaneous equations,” Quarterly Journal of Mechanics
and Applied Mathematics, 1 (1948), 149-73.

[4] Hartley, H. O., “The applications of some commercial calculating machines
to certain statistical calculations,” Supplement Journal Royal Statistical
Society, 8 (1946), 154-83.

[5] Stevens, W. L., “Statistical analysis of a non-orthogonal trifactorial
experiment,” Biometrika, 35 (1948), 346-67.

[6] Yates, F., Sampling Methods for Censuses and Surveys. London: Griffin, 1949.





THE USE OF RUNS TO CONTROL THE MEAN IN 
QUALITY CONTROL 


H. WEILER 
University of Technology, Sydney, Australia 


For quality control charts controlling the mean of a normal 
population, either small samples are taken out at frequent 
intervals or large samples at less frequent intervals. It will be 
shown that in order to detect small changes of the population 
mean, the amount of inspection is greatly reduced by the 
selection of large samples. However, if for other reasons small 
samples are desirable, a control by runs of sample means 
above or below certain control limits makes it possible to use 
small samples and yet maintain the advantage of a reduced
amount of inspection. For certain types of runs, the sample
size $n = 1$ turns out to be very economical, so that time saving
methods of control by gauging may be introduced without 
appreciable loss of efficiency. 


1. INTRODUCTION 


Consider a variate $x$ representing some measure of a mass-produced
article, and suppose that the production has been brought under
control. It may then go out of control in three different ways:

(a) The mean of the population may change, which could happen, for instance,
when a tool setting gets out of position or when a tool wears out.

(b) The standard deviation of the population may change, which could happen, 
for instance, when a fixed tool becomes loose. 

(c) The population may cease to be homogeneous, that is, elements may appear 
that do not belong to the original population. This may happen, for in- 
stance, when articles are produced by several machines; if one machine 
develops a fault, articles from that machine may be out of control while 
all other articles remain unaffected. 


While for a check on faults of type (a) a control chart controlling 
the mean of the population is most suitable, a control chart for stand- 
ard deviations or ranges is used to check on faults of type (b). For 
faults of type (c) both charts are useful, but it is essential that articles 
be selected from rational subgroups [1, 2]. For instance, if the same 
type of articles are produced by each of several machines, articles may 
be selected from each machine separately in order to allow discrimina- 
tion between the various machines. 

The usual control chart controlling the mean of a population is
constructed in the following way: After the mean and standard deviation
of the population have been reliably estimated, samples of fixed size $n$
are selected and their arithmetic means $\bar{x} = \sum x/n$ are calculated. A
chart is then constructed with control limits $m \pm B_1\sigma/\sqrt{n}$, where $m$ and
$\sigma$ are the estimates of the population mean and standard deviation, and
$B_1 = 3$ or $3.09$. The various values $\bar{x}$ are entered in the chart in
chronological order, and as soon as one such value falls outside the control
limits, production is stopped to allow investigation.

In this paper, we shall investigate the following alternative control
method. Instead of stopping the production when a single $\bar{x}$ value falls
outside the control limits $m \pm B_1\sigma/\sqrt{n}$, we may calculate a pair of
narrower limits $m \pm B_2\sigma/\sqrt{n}$ and stop production as soon as two successive
$\bar{x}$ values fall above the upper or below the lower of these limits. More
generally, we may calculate a pair of limits $m \pm B_\lambda\sigma/\sqrt{n}$ such that we
may stop production as soon as $\lambda$ successive values fall above the
upper or below the lower of these limits. In each case, $B_\lambda$ is determined
such that if the population mean does not change, an average of 1000
samples is necessary to produce one run of $\lambda$ successive $\bar{x}$ values above
the upper (or below the lower) control limit. Thus, in each case, a false
alarm can be expected about once in every 500 samples tested. On the
other hand, if the population mean does change, the amount of
inspection required to detect a given change will depend on $\lambda$ and $n$.

It has been shown in a previous paper [3] that for $\lambda = 1$ the most
economical sample size (that is, that value of $n$ which would lead to the
detection of a given change of the mean after a minimum of inspection) 
is much larger than the sample sizes usually used in quality control. 
Nevertheless, since small samples lend themselves readily to the detec- 
tion of faults of type (c), the quality control engineer may be reluctant 
to abandon them in favor of larger samples. It will be shown that for 
a check on faults of type (a) the use of runs makes it possible to retain 
small samples without an appreciable loss of efficiency. 

With the exception of a paper by Olmstead [4], in which runs are 
terminated whenever an observation turns out to be one of a specified 
kind, recent publications on runs deal mainly with runs within samples 
of fixed size [5, 6]. In particular, the theory has been applied to prob- 
lems of quality control in the form of runs above the sample median 
[7, 8], and runs up and down [9, 10, 11]. The runs in this paper differ 
from those of the other publications in that they are not related to a 
fixed number of observations. They constitute a test similar to a se- 
quential test [12], where the number of observations is not predeter- 
mined but depends on the outcome of the observations themselves. 
Although little mathematical research seems to have been done in this 
field, the method has been used intuitively by quality control engineers
[13, 14].







2. DETERMINATION OF CONTROL LIMITS FOR RUNS
Definition 


Consider a sequence of trials where each trial may or may not
produce an event $E$. If a particular trial produces the event $E$, we shall
call it a success. A sequence of $\lambda$ consecutive successes not preceded by
a success is called a run of $\lambda$ successes.

We shall make use of the following theorem [15]. 


Theorem 


If $p$ is the probability that a single random trial results in a success,
then the expected number $s$ of independent trials required to obtain a run of
$\lambda$ successes is $s = (1 - p^{\lambda})/[p^{\lambda}(1 - p)]$.

Using this theorem, we may solve the following problem. 


Problem 


Let $x$ be a normal variate of mean $m$ and standard deviation $\sigma$, and
let $\bar{x} = \sum x/n$ be the arithmetic mean of a sample of $n$ independent $x$
values. Every $\bar{x}$ is called a trial and every $\bar{x} \ge m + B\sigma/\sqrt{n}$ a success.
Determine $B$ such that on the average, 1000 trials are required to obtain one
run of $\lambda$ successes.

Let $p$ be the probability that a random trial $\bar{x}$ gives $\bar{x} \ge m + B\sigma/\sqrt{n}$.
Solving the above problem for $\lambda = 1$, we have $s = 1/p = 1000$, or $p = 0.001$.
Since $\bar{x}$ is normally distributed with mean $m$ and standard deviation
$\sigma/\sqrt{n}$, we obtain $B = 3.09$ from a set of normal probability tables.
Thus, if the $\bar{x}$ values are entered in a control chart with control limits
$m \pm 3.09\sigma/\sqrt{n}$, and if the population mean and standard deviation
remain unchanged, we can expect that an average of 1000 trials will be
required to obtain one trial above the upper control limit. Similarly, an
average of 1000 trials will be required to obtain one trial below the lower
control limit, so that an average of 500 trials can be expected to pass
before a “false alarm” or type I error [5] occurs.

In a similar manner, we may solve the problem for $\lambda = 2$. This gives
$s = (1 + p)/p^2 = y^2 + y = 1000$, where $y = 1/p$. Solving this equation, we
obtain $y = 31.127$ and $p = 0.03213$. The normal probability tables
give $B = 1.85$. Thus, if $\bar{x}$ is entered in a chart with control limits
$m \pm 1.85\sigma/\sqrt{n}$, and if two successive $\bar{x}$ values above the upper or below the
lower limit are regarded as significant, we can again expect that 500
trials will pass before a false alarm occurs.

For $\lambda = 3$, the equation reduces to $s = y + y^2 + y^3 = 1000$, which may
be solved by any numerical method. Using Newton’s method, we find
easily $y = 9.645$ and $p = 0.10368$, and we deduce $B = 1.26$.
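
The calculation of $B$ for any $\lambda$ can be sketched in Python as below. This is an illustration only: it uses bisection rather than the Newton's method mentioned above, the normal tables are replaced by `statistics.NormalDist` from the standard library, and the function names `expected_trials`, `solve_p` and `control_factor` are hypothetical.

```python
from statistics import NormalDist


def expected_trials(p, lam):
    """Expected number of trials to obtain a run of lam successes.

    Equal to (1 - p**lam) / (p**lam * (1 - p)), written as the sum
    y + y**2 + ... + y**lam with y = 1/p to avoid 0/0 at p = 1.
    """
    return sum(p ** -j for j in range(1, lam + 1))


def solve_p(lam, target=1000.0):
    """Find p with expected_trials(p, lam) = target by bisection."""
    lo, hi = 1e-9, 1.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if expected_trials(mid, lam) > target:
            lo = mid  # too many trials needed: p must be larger
        else:
            hi = mid
    return (lo + hi) / 2


def control_factor(lam, target=1000.0):
    """B such that Pr{z >= B} = p for the standardized normal variate z."""
    return NormalDist().inv_cdf(1.0 - solve_p(lam, target))


for lam in (1, 2, 3):
    p = solve_p(lam)
    print(lam, round(1 / p, 3), round(p, 5), round(control_factor(lam), 2))
```

For $\lambda = 1, 2, 3$ the printed values can be compared with the figures $y$, $p$ and $B$ quoted in the text.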







In this way, we calculate $B$ for $\lambda = 1, 2, 3, \ldots, 9$, and obtain the
following values.


TABLE I
CONTROL LIMIT FACTORS FOR $\lambda = 1, 2, \ldots$

$\lambda$        2        3        4        5        6        7

$y$       31.127    9.645    5.341    3.742    2.953    2.494
$p$        .0321    .1037    .1873    .2672    .3386    .4010
$B$         1.85     1.26     0.89     0.62     0.42     0.25


In each case, if we regard a run of $\lambda$ values above the upper or below
the lower control limit as significant, we can expect an average of 500
trials to pass before a type I error is committed.


3. THE AVERAGE AMOUNT OF INSPECTION FOR A GIVEN CHANGE 
OF THE POPULATION MEAN 


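Let $x$ be a normal variate and suppose that the control limits
$m \pm B\sigma/\sqrt{n}$ are adopted for the arithmetic mean $\bar{x} = \sum x/n$. If the
population mean changes from $\mu = m$ to $\mu = m + k\sigma$ $(k > 0)$ while the standard
deviation $\sigma$ remains constant, the probability that $\bar{x}$ exceeds the upper
control limit is (see also [3]):

\[
\begin{aligned}
P &= \Pr\{\bar{x} > m + B\sigma/\sqrt{n} \mid \mu = m + k\sigma\} \\
  &= \Pr\left\{ \frac{\bar{x} - m - k\sigma}{\sigma/\sqrt{n}} > B - k\sqrt{n} \;\middle|\; \mu = m + k\sigma \right\} \\
  &= \Pr\{z > B - k\sqrt{n}\},
\end{aligned}
\tag{1}
\]

where $z$ is the standardized normal variate (mean zero, standard
deviation one).

If we regard a run of $\lambda$ values above the upper control limit as
significant, we shall on the average require $S = (1 - P^{\lambda})/[P^{\lambda}(1 - P)]$ samples
to detect a change of the mean from $\mu = m$ to $\mu = m + k\sigma$. The
corresponding number of articles to be tested is

\[
A(n) = \frac{n(1 - P^{\lambda})}{P^{\lambda}(1 - P)},
\tag{2}
\]

where

\[
P = \frac{1}{\sqrt{2\pi}} \int_{B - k\sqrt{n}}^{\infty} e^{-z^2/2}\,dz.
\tag{3}
\]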













The value of $n$ for which $A(n)$ is a minimum may be found by solving
the equation $dA/dn = 0$. This has been done in [3] for $\lambda = 1$ and can
also be done for $\lambda = 2, 3, \ldots$, but a direct calculation of $A(n)$ for
various values of $n$ is less tedious and more instructive.
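
Such a direct calculation might be sketched in Python as follows. The sketch assumes the run-length formula of the theorem in Section 2 and replaces the normal tables by `statistics.NormalDist`; the names `exceed_probability` and `average_inspection` are hypothetical.

```python
from statistics import NormalDist


def exceed_probability(k, n, B):
    """P of equation (1): Pr{z > B - k*sqrt(n)} for a shift of k sigma."""
    return 1.0 - NormalDist().cdf(B - k * n ** 0.5)


def average_inspection(k, n, lam, B):
    """A(n) of equation (2): average number of articles tested before a
    run of lam sample means above the upper limit is obtained."""
    P = exceed_probability(k, n, B)
    # (1 - P**lam) / (P**lam * (1 - P)) written as a finite sum in 1/P.
    samples = sum(P ** -j for j in range(1, lam + 1))
    return n * samples


# Compare sample sizes for a shift of 0.4 sigma under two charts.
for n in (1, 5, 10, 20):
    conventional = average_inspection(0.4, n, lam=1, B=3.09)
    runs_of_two = average_inspection(0.4, n, lam=2, B=1.85)
    print(n, round(conventional), round(runs_of_two))
```

With no change in the mean ($k = 0$) the function returns close to 1000 samples, which is how the factors $B$ were chosen.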


Fig. 1. Average Amount of Inspection for $\lambda = 1$; $B = 3.09$; $n = 1, 5, 10, 20, 40$.


4. GRAPHICAL REPRESENTATION AND DISCUSSION 


Equations (2) and (3) show that when $n$, $\lambda$, and $B$ are given, the
average amount of inspection $A(n)$ is a function of $k$, which can be
readily calculated with the help of a set of normal tables. The variation of
$A(n)$ as a function of $k$ is shown in Figures 1, 2, 3, 4, for various values
of $n$ and $\lambda$. Since $A(n)$ increases rapidly with decreasing $k$, the use of
semi-logarithmic paper was found to be convenient.

It may be seen from Figure 1 that with the conventional control
chart ($\lambda = 1$), small samples usually require a much greater amount of
inspection than large samples. In particular, the sample size $n = 5$ is
economical only when the population mean changes by more than (say)
one standard deviation. The sample size $n = 1$ is particularly
uneconomical unless $k$ is very large.

Figure 2 gives the average amount of inspection required when two
successive values above the upper (or below the lower) control limit
are regarded as significant. It shows that here the amount of
inspection by means of small samples is greatly reduced. In particular, for
$k = 0.4$ and sample size $n = 5$, the average amount of inspection is 380
for the conventional chart and only 210 for the chart with $\lambda = 2$. The
sample size $n = 1$, although still uneconomical, is more economical than
with the conventional chart.

Fig. 2. Average Amount of Inspection for $\lambda = 2$; $B = 1.85$; $n = 1, 5, 10, 20$.

Figure 3 shows that a chart with $\lambda = 3$ represents a further improvement
for small samples, while large samples become uneconomical.













Figure 4 shows that for $\lambda = 6$ the sample size $n = 1$ is very economical.
This is an important result, because the high efficiency of a control
by runs of individual values makes it possible to use gauges where
otherwise costly measurements are required. The loss of efficiency that
testing by gauges usually entails is here avoided.


Fig. 3. Average Amount of Inspection for $\lambda = 3$; $B = 1.26$; $n = 1, 5, 10, 20$.


A similar graph for $\lambda = 9$, $B = 0$ would show that the efficiency of
sample size $n = 1$ is about the same as for a chart with $\lambda = 6$. The
advantage of taking $\lambda = 9$ rather than $\lambda = 6$ would be that $B$ is equal to
zero so that no control limits need to be calculated.


5. THE CHOICE OF THE MOST SUITABLE CONTROL CHART 


Since for any given value of $B$, the probability $P$ defined by equation
(3) is a function of $k\sqrt{n}$ alone, the expression

\[
k^2 A(n) = \frac{(k\sqrt{n})^2 (1 - P^{\lambda})}{P^{\lambda}(1 - P)}
\tag{4}
\]

remains constant as long as $k\sqrt{n}$ and $\lambda$ are kept fixed. For any given
value of $\lambda$, this expression is thus a function of the one variable $k\sqrt{n}$.
It is easy to calculate the values of the function for any values of $k\sqrt{n}$
and to plot the corresponding curve. This has been done in Figure 5 for
$\lambda = 1, 2, 3, 6, 9$. The curves show clearly that the conventional chart,
based on $\lambda = 1$, is economical only when $k\sqrt{n}$ is greater than (say) 2.5.

Fig. 4. Average Amount of Inspection for $\lambda = 6$; $B = 0.42$; $n = 1, 5, 10$.

Fig. 5. Average Amount of Inspection for $\lambda = 1, 2, 3, 6, 9$.

This means that the conventional chart is most efficient in a range
that is usually of little interest. If, for instance, the sample size $n = 4$
is used, the conventional chart is efficient only when the population
mean changes by more than 1.3 standard deviations. A chart with
$\lambda = 2$, on the other hand, would then be very efficient for changes of
between 0.8 and 1.5 standard deviations and is superior to a chart
with $\lambda = 1$ for any change up to 1.4 standard deviations. The saving of
inspection may be anything up to 40%.

The saving is even greater (up to 50%) when $\lambda = 3$ is used, but the
range for which such a chart is most efficient is somewhat reduced.
When $\lambda = 6$ is used, the saving may be anything up to 60%. However,
the range of high efficiency is further reduced and the chart becomes
rather inefficient when $k\sqrt{n} > 2$.

The case $\lambda = 9$ is of special interest, because $B$ is equal to zero. This
means that no control limits need to be drawn. A chart then becomes
unnecessary, and gauging methods may be adopted when the sample
size $n = 1$ is adopted. Moreover, any of the above charts may be
combined with the observation of individual articles. Production should be
stopped to allow investigation as soon as $\lambda$ successive $\bar{x}$ values fall
above the upper or below the lower limit, or when 9 successive single
values fall above or below the population mean.


6. CONCLUSION 


It has been demonstrated that the sequential use of runs for control 
charts controlling the mean leads to great saving of inspection, and 
that it will, in many cases, be of advantage to introduce it instead of 
the conventional chart. The conventional chart is to be preferred only 
when large samples are not a disadvantage or when sequential methods 
are undesirable. 

7. REFERENCES 


[1] A.S.T.M. Manual on Quality Control of Materials, American Society for
Testing Materials, Philadelphia, 1951.

[2] Olmstead, P. S., “How to detect the type of an assignable cause,” Industrial
Quality Control, 9 (1953), 22-32.

[3] Weiler, H., “On the most economical sample size for controlling the mean
of a population,” Annals of Mathematical Statistics, 23 (1952), 247-54.

[4] Olmstead, P. S., “Note on theoretical and observed distributions of
repetitive occurrences,” Annals of Mathematical Statistics, 11 (1940), 363-66.

[5] Wilks, S. S., Mathematical Statistics, Princeton University Press, 1943.

[6] Mood, A. M., “The distribution theory of runs,” Annals of Mathematical
Statistics, 11 (1940), 367-92.

[7] Mosteller, F., “Note on an application of runs to quality control charts,”
Annals of Mathematical Statistics, 12 (1941), 228-31.

[8] Shewhart, W. A., “Contributions of statistics to the science of engineering,”
Proceedings of the Bicentennial Celebration of the University of Penna.,
Philadelphia, 1941.

[9] Wallis, W. Allen, and Moore, Geoffrey H., A Significance Test of Time Series
Analysis, National Bureau of Economic Research, New York, 1941.

[10] Wolfowitz, J., “On the theory of runs with some applications to quality
control,” Annals of Mathematical Statistics, 14 (1943), 280-88.

[11] Olmstead, P. S., “Distribution of sample arrangements for runs up and
down,” Annals of Mathematical Statistics, 17 (1946), 24-33.

[12] Wald, A., Sequential Analysis, New York, John Wiley and Sons, 1947.

[13] Grant, E. L., Statistical Quality Control. New York: McGraw-Hill, 1946.

[14] Dudding, B. P., and Jennett, W. J., Quality Control Charts, British
Standards, 600R, 1942.

[15] Feller, W., Probability Theory and its Applications, Vol. I, Chapter 13,
New York: John Wiley and Sons, 1950.











TRUNCATED POISSON DISTRIBUTIONS 


Paul R. Rider
Wright-Patterson Air Force Base and Washington University 


This paper gives a method of estimating the parameter of 
a Poisson distribution which has been truncated at the lower 
end. Application is made to a number of actual examples. 


INTRODUCTION 


Many studies have been made of truncated distributions. (See [2]
and the references contained therein.) Of the continuous type,
the normal distribution and the Pearson system of distributions have
been rather thoroughly investigated. Of discrete distributions, the
binomial has been studied by Finney [8].

Yule [6] has considered an interesting type of distribution which he 
met in studying vocabulary. This is the number of words occurring 
once, the number occurring twice, and so on, in a specified work of a 
certain author. The distribution is somewhat similar to a truncated 
discrete distribution, in that there is no frequency corresponding to 
the number of words occurring zero times. Obviously there can be no 
frequency corresponding to the zero class unless it can be assumed that 
the total number of words in the author’s vocabulary is known. The 
frequency of the zero class would then be those words in his vocabulary 
which were not used in the particular work under consideration. 

Other examples of truncated discrete distributions can easily be 
thought of. Consider, for example, the distribution of number of traf- 
fic violations. There will be certain persons who have received 1 ticket, 
some who have received 2 tickets, some 3, and so on. There will be no 
record of those who have received no tickets. 

The present paper considers another discrete distribution, the Pois- 
son. 

As is well known, the Poisson probability function is the function

\[
p_x = e^{-\lambda}\lambda^{x}/x!
\tag{1}
\]

This gives the limit, as the number of trials approaches infinity but
the number of expected occurrences $\lambda$ remains constant, of the
probability that an event will occur exactly $x$ times, $x$ ranging over the
non-negative integers.

The function contains the single parameter A, to which, incidentally, 
each and every semi-invariant of the Poisson distribution is equal. Tip- 
pett [5] and Bliss [1] have considered the question of estimating this 


826 








)is- 


(1) 


ut 





TRUNCATED POISSON DISTRIBUTIONS 827 


parameter when the frequencies of those classes corresponding to values
of $x$ above a certain specified value have been pooled. Fisher and
Yates [4], p. 1, have shown that for an even number of degrees of
freedom, the probability of exceeding a given value of $\chi^2$ is
reducible to a partial sum of a Poisson series, i.e., a Poisson series
with the upper end truncated. The present paper gives methods of
estimating $\lambda$ when some of the data in a sample are missing,
particularly when the lower end is truncated.
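The relation attributed to Fisher and Yates can be illustrated with a short sketch. It assumes the standard identity that, for $2m$ degrees of freedom, the probability of exceeding $\chi^2$ equals the sum of the first $m$ Poisson terms with parameter $\chi^2/2$. The function name and the test values are illustrative, not from the paper.

```python
from math import exp, factorial

def chi2_upper_tail_even_df(chi2, df):
    """P(chi-square with even df exceeds chi2) as a partial Poisson sum."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    lam = chi2 / 2.0
    # Sum the Poisson terms for x = 0, 1, ..., df/2 - 1 (upper end truncated).
    return sum(exp(-lam) * lam**x / factorial(x) for x in range(df // 2))

# 5.991 and 9.488 are the familiar 5 per cent points for 2 and 4 degrees of freedom.
print(chi2_upper_tail_even_df(5.991, 2))
print(chi2_upper_tail_even_df(9.488, 4))
```

Both printed values should be close to .05.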


ESTIMATING THE PARAMETER FROM TWO CLASS FREQUENCIES 


If a sample is truly Poisson in character, the value of $\lambda$ can be
estimated even when only two different class frequencies are known. Let
us designate by $f_x$ the frequency with which the value $x$ occurs in
the sample. Then the expected value of $f_x$ is $Np_x$, where $N$ is the
number in the sample. If we use the observed frequencies of two
different classes, $x = m$ and $x = n$, as estimates of their expected
values, we are led to the equation

$$\frac{f_n}{f_m} = \frac{m!\,\lambda^{n-m}}{n!} \tag{2}$$

which is easily solved for $\lambda$.
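As a minimal sketch of solving (2), assuming the form $f_n/f_m = m!\,\lambda^{n-m}/n!$: the function name is hypothetical, and the two frequencies are the $x=1$ and $x=2$ counts of the sample with $\lambda=2$ in Table 2.

```python
from math import factorial

def lambda_from_two_classes(m, f_m, n, f_n):
    """Solve f_n / f_m = m! * lam**(n - m) / n! for lam."""
    if m == n:
        raise ValueError("the two classes must differ")
    if f_m <= 0 or f_n <= 0:
        raise ValueError("both class frequencies must be positive")
    ratio = (f_n * factorial(n)) / (f_m * factorial(m))  # equals lam**(n - m)
    return ratio ** (1.0 / (n - m))

# f_1 = 31 and f_2 = 25 give lam = 2 * 25 / 31.
print(round(lambda_from_two_classes(1, 31, 2, 25), 3))
```

An estimate built on only two classes ignores the rest of the sample, so it should be expected to vary a good deal from sample to sample.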


ESTIMATING THE PARAMETER FROM A TRUNCATED SAMPLE 


We wish now to consider the case in which one or more classes at the
lower end of the sample are missing. We shall use the following
notation:

$$T_0 = \sum_{x=k}^{\infty} f_x, \qquad T_1 = \sum_{x=k}^{\infty} x f_x, \qquad T_2 = \sum_{x=k}^{\infty} x^2 f_x, \tag{3}$$

where $k$ is the number of missing classes. Further, let

$$\begin{aligned}
T_0' &= N\sum_{x=0}^{k-1} p_x + T_0,\\
T_1' &= N\sum_{x=0}^{k-1} x\,p_x + T_1,\\
T_2' &= N\sum_{x=0}^{k-1} x^2 p_x + T_2.
\end{aligned} \tag{4}$$

Then $T_1'/T_0'$ is an estimate of the mean $\lambda$, and similarly,
$T_2'/T_0'$ is an estimate of the second moment of the distribution
about the origin, viz., $\lambda + \lambda^2$.














828 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1953 
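We therefore set

$$T_1' = \lambda T_0', \qquad T_2' = (\lambda + \lambda^2)\,T_0'. \tag{5}$$

Substituting from (4) and (1), we are led, after some reduction, to the
following equations:

$$T_1 - \lambda T_0 = \frac{N e^{-\lambda}\lambda^k}{(k-1)!}, \tag{6}$$

$$T_2 - (\lambda + \lambda^2)\,T_0 = \frac{N e^{-\lambda}\lambda^k}{(k-1)!}\,(k + \lambda). \tag{7}$$

Solving these simultaneous equations for $\lambda$, we get

$$\lambda = \frac{T_2 - kT_1}{T_1 - (k-1)T_0}. \tag{8}$$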
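The paper does not give the intermediate steps. One way to reach (6) and (8) is sketched here as a reconstruction, using only (4), (5) and the Poisson recurrence $x\,p_x = \lambda\,p_{x-1}$:

```latex
% Substituting (4) into the first equation of (5) and collecting terms:
T_1 - \lambda T_0
  = N\Bigl(\lambda\sum_{x=0}^{k-1} p_x - \sum_{x=0}^{k-1} x\,p_x\Bigr)
  = N\lambda\,p_{k-1}
  = \frac{N e^{-\lambda}\lambda^{k}}{(k-1)!},
% because x p_x = \lambda p_{x-1} gives
% \sum_{x=0}^{k-1} x p_x = \lambda \sum_{x=0}^{k-2} p_x .
%
% Dividing (7) by (6) eliminates N:
T_2 - (\lambda+\lambda^2)\,T_0 = (k+\lambda)\,(T_1 - \lambda T_0)
\;\Longrightarrow\;
T_2 - kT_1 = \lambda\,\bigl[\,T_1 - (k-1)\,T_0\,\bigr].
```

The terms in $\lambda^2 T_0$ cancel on both sides, so the result is linear in $\lambda$.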


When $\lambda$ has been estimated from (8), all missing $f_x$ can be
estimated, as can the total frequency.
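A minimal sketch of the estimator (8), applied to the $\lambda=0.5$ sample of Table 2 with its zero class removed. The function names are hypothetical. The completion step is an assumption, since the paper does not say how the missing classes are to be filled in: here the total is taken as $T_0$ divided by the estimated probability of $x \ge k$.

```python
from math import exp, factorial

def moment_sums(freqs, k):
    """Return T0, T1, T2 of equation (3) from a dict {x: f_x}, using x >= k only."""
    items = [(x, f) for x, f in freqs.items() if x >= k]
    T0 = sum(f for _, f in items)
    T1 = sum(x * f for x, f in items)
    T2 = sum(x * x * f for x, f in items)
    return T0, T1, T2

def truncated_poisson_lambda(freqs, k):
    """Equation (8): (T2 - k*T1) / (T1 - (k - 1)*T0)."""
    T0, T1, T2 = moment_sums(freqs, k)
    return (T2 - k * T1) / (T1 - (k - 1) * T0)

def fill_missing_classes(freqs, k, lam):
    """Assumed completion: N = T0 / P(x >= k), then f_x = N * p_x for x < k."""
    T0, _, _ = moment_sums(freqs, k)
    p = [exp(-lam) * lam**x / factorial(x) for x in range(k)]
    N = T0 / (1.0 - sum(p))
    return N, [N * px for px in p]

observed = {1: 32, 2: 12, 3: 2}  # Table 2, lambda = 0.5, zero class missing
lam1 = truncated_poisson_lambda(observed, k=1)
print(round(lam1, 2))  # 0.58, the value recorded in Table 2
print(fill_missing_classes(observed, 1, lam1))
```

With $k=2$ the same data give $6/16 = 0.375$, which Table 2 records as 0.38.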


EMPIRICAL SAMPLING 


As a test of how good an estimate of $\lambda$ is provided by (8), samples
of size 100 were drawn, by using random numbers, from populations of
10,000, conforming as closely as possible to Poisson distributions. The
following values of $\lambda$ were used: 0.5, 1, 2, 3, 4, 5. These samples
are shown in Table 2. The first column gives the values of $x$. The
second column, headed $\lambda=0.5$, gives the frequencies, for the
respective values of $x$, in the sample drawn from the Poisson
population in which the parameter $\lambda$ has the value 0.5; the column
headed $\lambda=1$ gives the frequencies in the sample drawn from the
population in which the parameter has the value 1, and so on.

We shall denote by $\lambda'$ the estimate of $\lambda$ obtained from (8)
with $k=1$, and by $\lambda''$ the estimate of $\lambda$ obtained from (8)
with $k=2$. Values of $\lambda'$ and $\lambda''$ are recorded in Table 2.
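A comparable experiment can be sketched in a few lines. This is an assumption-laden illustration rather than the procedure of the paper: it draws directly from a Poisson distribution by inverting the cumulative sum, instead of sampling from a finite population of 10,000, and the seed and function names are arbitrary.

```python
import random
from math import exp

def draw_poisson(lam, rng):
    """Draw one Poisson(lam) value by inverting the cumulative distribution."""
    u = rng.random()
    x, p = 0, exp(-lam)
    cdf = p
    while u > cdf and p > 0.0:  # p > 0 guards against rounding near u = 1
        x += 1
        p *= lam / x
        cdf += p
    return x

def estimate_8(sample, k):
    """Estimator (8) computed from the values x >= k of a raw sample."""
    kept = [x for x in sample if x >= k]
    T0, T1, T2 = len(kept), sum(kept), sum(x * x for x in kept)
    denom = T1 - (k - 1) * T0
    return (T2 - k * T1) / denom if denom else float("nan")

def simulate_estimates(lam, n, rng):
    """Return the k=1 and k=2 estimates from one simulated sample of size n."""
    sample = [draw_poisson(lam, rng) for _ in range(n)]
    return estimate_8(sample, 1), estimate_8(sample, 2)

rng = random.Random(1953)  # arbitrary seed
for lam in (0.5, 1, 2, 3, 4, 5):
    est1, est2 = simulate_estimates(lam, 100, rng)
    print(lam, round(est1, 2), round(est2, 2))
```

Repeating the loop with different seeds gives a rough picture of the sampling variation of the two estimates at sample size 100.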


COMPARISON WITH MAXIMUM LIKELIHOOD ESTIMATES 


It can be shown that the maximum likelihood estimate of $\lambda$ is given
by the solution of the equation

$$\nu_1^{(k)} = \frac{\lambda\sum_{x=k-1}^{\infty} p_x}{\sum_{x=k}^{\infty} p_x}, \tag{9}$$













where $\nu_1^{(k)}$ is the first moment of the truncated sample. In particular,










































































we have

$$\nu_1' = \frac{\lambda}{1 - e^{-\lambda}}, \qquad \nu_1'' = \frac{\lambda(1 - e^{-\lambda})}{1 - e^{-\lambda} - \lambda e^{-\lambda}}. \tag{10}$$
TABLE 1

$\lambda$ | $\nu_1'$ | $\nu_1''$ || $\lambda$ | $\nu_1'$ | $\nu_1''$ || $\lambda$ | $\nu_1'$ | $\nu_1''$
0.1 | 1.051 | 2.034 || 0.9 | 1.517 | 2.347 || 1.7 | 2.080 | 2.742
0.2 | 1.103 | 2.069 || 1.0 | 1.582 | 2.392 || 1.8 | 2.156 | 2.797
0.3 | 1.157 | 2.105 || 1.1 | 1.649 | 2.438 || 1.9 | 2.234 | 2.854
0.4 | 1.213 | 2.142 || 1.2 | 1.717 | 2.486 || 2.0 | 2.313 | 2.911
0.5 | 1.271 | 2.181 || 1.3 | 1.787 | 2.534 || 2.5 | 2.724 | 3.220
0.6 | 1.330 | 2.221 || 1.4 | 1.858 | 2.584 || 3.0 | 3.157 | 3.560
0.7 | 1.391 | 2.262 || 1.5 | 1.931 | 2.635 || 4.0 | 4.075 | 4.323
0.8 | 1.453 | 2.304 || 1.6 | 2.005 | 2.688 || 5.0 | 5.034 | 5.176
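TABLE 2
SAMPLES FROM POISSON DISTRIBUTIONS

$x$ | $\lambda=0.5$ | $\lambda=1$ | $\lambda=2$ | $\lambda=3$ | $\lambda=4$ | $\lambda=5$
0 | 54 | 47 | 20 | 6 | 4 | 0
1 | 32 | 26 | 31 | 7 | 9 | 7
2 | 12 | 14 | 25 | 32 | 21 | 5
3 | 2 | 9 | 10 | 14 | 20 | 10
4 |  | 4 | 10 | 19 | 17 | 28
5 |  |  | 3 | 15 | 10 | 11
6 |  |  | 1 | 4 | 8 | 16
7 |  |  |  | 2 | 8 | 14
8 |  |  |  | 1 | 2 | 6
9 |  |  |  |  | 1 | 2
10 |  |  |  |  |  | 1
$\lambda'$ | 0.58 | 1.34 | 1.86 | 3.02 | 3.70 | 4.68
$\lambda''$ | 0.38 | 1.34 | 1.95 | 2.93 | 3.73 | 4.65
$\nu_1'$ | 1.35 | 1.83 | 2.15 | 3.30 | 3.73 | 4.84
$\nu_1''$ | 2.14 | 2.63 | 2.88 | 3.48 | 4.01 | 5.13
$\hat\lambda'$ | 0.63 | 1.36 | 1.79 | 2.71 | 3.62 | 4.80
$\hat\lambda''$ | 0.40 | 1.49 | 1.94 | 2.89 | 3.59 | 4.94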





































To assist in solving these equations, values were assigned to $\lambda$,
and the corresponding values of $\nu_1'$ and $\nu_1''$ were calculated.
Results are shown in Table 1. From this table, for a given value of
$\nu_1'$ or $\nu_1''$, the maximum likelihood estimate, $\hat\lambda'$ or
$\hat\lambda''$, of the parameter $\lambda$ can be obtained by
interpolation. The values for the samples obtained in this study are
exhibited in Table 2.
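In place of interpolation in Table 1, the likelihood equation can be solved numerically. The sketch below assumes that the equation equates the observed truncated mean to the mean of a Poisson variate conditioned on $x \ge k$, which agrees with (10) for $k=1$ and $k=2$. The function names, search interval and tolerance are illustrative choices.

```python
from math import exp, factorial

def truncated_mean(lam, k):
    """Mean of a Poisson(lam) variate conditioned on x >= k."""
    pmf = [exp(-lam) * lam**x / factorial(x) for x in range(k)]
    upper = 1.0 - sum(pmf[:max(k - 1, 0)])  # P(x >= k - 1)
    lower = 1.0 - sum(pmf)                  # P(x >= k)
    return lam * upper / lower

def mle_lambda(nu, k, lo=1e-4, hi=100.0, tol=1e-10):
    """Solve truncated_mean(lam, k) = nu by bisection (the mean increases with lam)."""
    if nu <= k:
        raise ValueError("the truncated sample mean must exceed k")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if truncated_mean(mid, k) < nu:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Entries of Table 1 for lambda = 0.5, then the estimate for nu_1' = 1.35.
print(round(truncated_mean(0.5, 1), 3), round(truncated_mean(0.5, 2), 3))
print(round(mle_lambda(1.35, 1), 2))
```

Bisection is used only because the truncated mean is monotone in $\lambda$; any one-dimensional root finder would serve.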


CONCLUSION 


As judged by the limited number of samples in this study, the estimates
of $\lambda$ provided by the suggested method seem in most cases to be
somewhat better than those provided by the method of maximum
likelihood. This is particularly true when only the lowest class is
missing. Moreover, the method is quite simple and direct, while the
method of maximum likelihood requires the solution of equation (9)
either by trial and error or by the use of tables similar to Table 1.


REFERENCES 


[1] Bliss, C. I., “Estimation of the mean and its error from incomplete Poisson
distributions,” Connecticut Agricultural Experiment Station Bulletin, 513 
(1948), 12 pp. 

[2] Cohen, A. C., Jr., “Estimation of parameters in truncated Poisson frequency 
distributions,” Annals of Mathematical Statistics, 22 (1951), 255-65. 

[3] Finney, D. J., “The truncated binomial distribution,” Annals of Eugenics, 
14 (1949), 319-28. 

[4] Fisher, Ronald A., and Yates, Frank, Statistical Tables for Biological, Agri- 
cultural and Medical Research. New York: Hafner Publishing Company, Inc., 
1949, p. 1. 

[5] Tippett, L. H. C., “A modified method of counting particles,” Proceedings of 
the Royal Society of London, Series A, 137 (1932), 434-46. 

[6] Yule, G. Udny, Statistical Study of Literary Vocabulary. Cambridge Uni- 
versity Press, 1944. 


ADDENDUM 


Attention should be called to a paper by F. N. David and N. L.
Johnson, “The Truncated Poisson,” Biometrics, 8 (1952), 275-85,
which appeared after my paper was submitted for publication. The
authors consider the special case $k=1$ of the estimator which I have
proposed. They show that it has an efficiency less than 1. The efficiency
has a minimum value of about 70%, which occurs in the range $\lambda=2.5$
to $\lambda=3.0$, and approaches 100% with increasing $\lambda$.

















PERCENTAGE POINTS OF THE INCOMPLETE 
BETA FUNCTION 


Robert E. Clark
The Pennsylvania State College 


The table presented here gives to four significant figures the values
of $p = P(N, X, \alpha)$ defined by

$$\alpha = \sum_{r=X}^{N} \binom{N}{r} p^r (1-p)^{N-r} = I_p(X, N-X+1),$$

for $N = 10(1)50$, $X = 1(1)N$, and $\alpha = .005$, $.010$, $.025$, and
$.050$, where $I_p(X, Y)$ is Karl Pearson’s incomplete beta function
ratio.¹ Values of $p$ for which $I_p(X, Y) = .005, .010, .025, .05, .10,
.25, .50$ have been given by Thompson² in terms of the arguments
$\nu_1 = 2Y$ and $\nu_2 = 2X$. The entries in the present table were
obtained by inverse linear interpolation of the logarithms of the
accumulated frequencies found in Pearson’s Tables of the Incomplete Beta
Function and by interpolation with Lagrangian coefficients of Thompson’s
percentage points of the incomplete beta function. For values of
$p > .2000$ these two methods of interpolation gave results which agreed
within two units in the fourth significant figure, in spite of the fact
that for $\nu_2 > 30$ in Thompson’s tables double interpolation was
employed. Thompson’s tables for $p < .2000$ were worked twice to insure
accuracy, and then were accepted as accurate. For values of $p > .2000$
the data were smoothed by taking fourth differences, staying within the
limits set by these two methods of interpolation. The data are therefore
felt to be accurate within ±1.5 in the last significant figure.
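Individual entries can be checked by direct computation instead of interpolation. The following is a minimal sketch, assuming the definition above (the upper binomial tail from $X$ to $N$ equals $\alpha$). It is not the method used to construct the table, and the function names and tolerance are illustrative.

```python
from math import comb

def upper_tail(N, X, p):
    """Binomial sum from r = X to N, i.e. I_p(X, N - X + 1)."""
    return sum(comb(N, r) * p**r * (1.0 - p)**(N - r) for r in range(X, N + 1))

def percentage_point(N, X, alpha, tol=1e-12):
    """Find p with upper_tail(N, X, p) = alpha by bisection (tail increases with p)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if upper_tail(N, X, mid) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Table entries are printed as p times 10,000.
print(round(1e4 * percentage_point(10, 1, 0.005), 3))
print(round(1e4 * percentage_point(14, 10, 0.05)))
```

For $X = N$ the equation reduces to $p^N = \alpha$, which gives a quick hand check of the last row of each block.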
Since binomial sums and the incomplete Beta function appear frequently
in statistics, the table may be used in a number of problems:
1. Confidence limits for binomial variates may be obtained directly from the
table.
2. The table gives some percentage points for the incomplete Beta function
which are not given by Thompson.²
3. The values in the table are the lower percentage points of all order
statistics in samples of size from 10 to 50 inclusive. From these values
it is possible to obtain the corresponding percentage points of any
continuous distribution by the method given by Curtiss.³





1 Tables of the Incomplete Beta Function, edited by Karl Pearson, Biometrika Office, University 
College, London, W.C. 1. 

2 Catherine M. Thompson: “Percentage points of the incomplete beta function,” Biometrika, 32 
(1941) 168-81. 

3 J. H. Curtiss: “Convergent sequences of probability distributions,” American Mathematical
Monthly, 50 (1943) 103-5.















4. The .05 column of the table is an extension of a part of the table given by
Grubbs.⁴
5. Percentage points of the F distribution for $\alpha = .005, .010, .025, .050$;
$n_1 = 2(2)100$, $n_2 = 2(2)100$ for $n_1 + n_2 - 2 \le 100$ may be obtained
from the present table by making the transformation⁵

$$F = \frac{n_2(1-p)}{n_1 p}$$

where $p = P[\tfrac{1}{2}(n_1+n_2-2), \tfrac{1}{2}n_2, \alpha]$. For example
$F_{.05}$ for $n_1 = 10$, $n_2 = 20$ is 2.35 since $p = P(14, 10, .05) = .46$.


These are only a few of the applications which may be made of the
table which is presented here. The reader will probably know of others.
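The F transformation of item 5 can be sketched as follows. This is an illustration under the assumption that $P(N, X, \alpha)$ is the root of the binomial-tail equation. Here $p$ is computed by bisection, not read from the table, and the function names are hypothetical.

```python
from math import comb

def percentage_point(N, X, alpha, tol=1e-12):
    """p such that the binomial sum from r = X to N equals alpha."""
    def tail(p):
        return sum(comb(N, r) * p**r * (1.0 - p)**(N - r) for r in range(X, N + 1))
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tail(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def f_upper_point(n1, n2, alpha):
    """Upper alpha point of F(n1, n2) for even n1, n2 via F = n2(1 - p) / (n1 p)."""
    if n1 % 2 or n2 % 2:
        raise ValueError("n1 and n2 must both be even")
    p = percentage_point((n1 + n2 - 2) // 2, n2 // 2, alpha)
    return n2 * (1.0 - p) / (n1 * p)

print(round(f_upper_point(10, 20, 0.05), 2))  # 2.35, the example given in the text
```

With the tabled value $p = .4600$ the same formula gives $20(.54)/(10 \times .46) \approx 2.35$.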


TABLE 1 


PERCENTAGE POINTS OF THE INCOMPLETE BETA FUNCTION 
(times 10,000) 











N X .005 .010 .025 .050 N X .005 .010 .025 .050
10 1 5.011 10.05 25.29 51.16 7 2085 2349 2767 3152
2 108.5 155.4 252.1 367.7 8 2725 3024 3489 3909 
3 370.1 475.1 667.4 872.6 9 3448 3778 4281 4727 
4 767.7 932.1 1216 1500 10 4270 4627 5159 5619 
5 1283 1504 1871 2224 11 5230 5605 6152 6613 
6 1909 2183 2624 3035 12 6431 6813 7354 7791 
7 2649 2971 3476 3934 
8 3518 3883 4439 4931 13 1 3.855 7.728 19.46 39.38 
9 4557 4956 5550 6058 2 82.52 118.2 192.1 280.5 
10 5887 6310 6915 7411 3 278.3 357.8 503.8 660.5 
4 570.8 694.6 909.2 1127 
11 1 4.556 9.133 22.99 46.52 5 942.3 1108 1386 1657 
2 98.20 140.7 228.3 333.2 6 1383 1588 1922 2239 
3 333.4 428.2 602.2 788.2 7 1887 2129 2513 2870 
4 688.4 836.6 1093 1351 8 2454 2729 3158 3548 
5 1145 1344 1675 1996 9 3087 3391 3857 4274 
6 1693 1940 2338 2712 10 3794 4122 4619 5054 
7 2332 2622 3079 3498 11 4590 4939 5455 5899 
8 3067 3396 3903 4356 12 5510 5872 6397 6837 
9 3915 4277 4822 5299 13 6653 7017 7530 7942 
10 4914 5302 5872 6356 
11 6178 6579 7151 7616 14 1 3.580 7.176 18.07 36.57 
2 76.42 109.5 178.0 260.0 
12 1 4.176 8.372 21.08 42.65 3 257.1 330.6 465.8 611.0 
2 89.68 128.5 208.6 304.6 4 525.9 640.3 838.9 1041 
3 303.4 389.8 548.6 718.7 5 866.0 1019 1276 1527 
4 624.0 759.0 992.5 1229 6 1267 1457 1766 2061 
5 1034 1215 1517 1810 7 1724 1947 2304 2636 
6 1522 1746 2109 2453 8 2234 2488 2886 3250 





4 Frank E. Grubbs, “On designing single sampling inspection plans,” Annals of Mathematical
Statistics, 20 (1949), 242-56.
5 Maxine Merrington and Catherine M. Thompson: “Tables of percentage points of the inverted
beta (F) distribution,” Biometrika, 33 (1943), 73-88; and C. J. Burke: “Computation of the levels of
significance in the F test,” Psychological Bulletin, 48 (1951) 392-97.








INCOMPLETE BETA FUNCTIONS 833 











TABLE 1—(cont.)

N X .005 .010 .025 .050 N X .005 .010 .025 .050
he 9 2799 3080 3514 3904 16 © 6370«/«6684=— 71317499 
10 3421 3726 «©4190-4600 17 -7322,—S« 7627S 80478384 
11 = 4108S 4433—« «4920-5343 
12 4877 «=«—«5217,——s«B719——s«C 4G 18 1 2.784 5.582 14.06 28.46 
13-5760, «6109S 6137083 2 58.99 84.57 137.5 201.1 
14 = 6849——s«s7107/—«s«7684=— (8074 3 197.0 253.6 357.9 470.2 
is 4 400.2 488.0 640.9 796.9 
150 i(‘<aLMSsi“‘é‘<« 68 | 16.86 | 84.14 5 = 654.4 771.9 969.5 1164 
2 71.17 102.0 165.8 242.3 6 950.7 1006 1334 1568 
he 3 238.9 307.2 433.1 568.5 7 1284 145417301989 
4 487.6 593.9 778.7 966.6 8 165018442153 2440 
rs 5 801.1 943.6 1182 1417 9 2046 «= 2263 2602 2912 
6 1170 ~=«1346Ss«1634~=—«1909 10 ©2474 «= 2710» 3076 = (3406 
7 1587 1795 2127) 2437 ae 2982 3186 3575 3922 
8 2051 2287 2659 3000 12 3421 3691 4099 4460 
N 9 2561 2823 3229 3596 13 3945 4228 4652 5022 
= 12 4395—«—is«AT1S—— BIDS (5:G02 16 = «5783 8088 65296897 
) 13 5137-5468 = «5954 = 6366 17 6537 6840 7271 7623 
iad 14 5984 6321 6805 7206 18 7450 7743 $147 8467 
2 15 7024 «= 7356 = 7820-8190 
. 19 «=1 = - 2.688 «5.288 =—«13.32 26.96 
: 16 «= sAs«S.132 6.280) 15.81 82.01 : pre ones oo : pe 
; : a oy ar 4 377.7 460.6 605.2 752.9 
, 4 454.5 553.8 726.6 902.5 ; a ——- a ae 
; = = = a 1801888 1020 1875 
6 1086 1251 1520 1 
, 7 1471 «= '1665 19752267 . . a. a a 
, 8 1807 2117 2465 2786 
: 9 2362 2607 «2988-3334 a a Ba me 
' sons Ss = = 123191 3447 38364181 
1k «3415 35701 41344517 3 4th: ti«CTO 
; sm a cs ms 14-4182 4462 «48805242 
so oe om os 15 «4729S 50184445809 
. 14 53725605 C165 «562 ‘ose 6 (lems 
‘ 15 «6186. «6512, «6977-7360 So - «- #- 
‘ 16 = 7181. «7499S 7941 8298 18 essiéatsi‘ié‘éOT?:OC*«*CTTORG 
19 7567 7848 8235 8541 
17 -1—s«2.948 5.910 14.88 30.13 
} 2 62.56 89.67 145.8 213.2 20 1 2.506 5.024 12.65 25.61 
; 3 209.2 260.2 «379.9 499.0 2 52.95 75.92 123.5 180.7 
) 4 425.6 518.8 681.1 846.4 3 176.4 «227.1 320.7 421.7 
5 697.0 821.7 1031 1238 4 357.6 436.2 573.3 713.5 
6 1014 1168 1421 1664 5 583.3 688.4 865.7 1041 
7 1371 1552 1844 2119 6 845.5 975.4 1189 1395 
; 8 1764 1971 2208 2601 7 139 1292 ©1539 «1778 
) 9 2193 2423 2781 3108 8 1460 1634 1912 2171 
* 10 «2656 «2006 «= 32083640 9 1806 2001 2306 2586 
11 = 3154-3423, 38334197 10 ©2177, S390») 27203020 
= 12 3690-3075 44044781 11 =—s-:2572 2801S 31523469 
13 4268 «= 4566 «= 50105395 12-2001 3234 = 36053956 
- 14 + 4806S s«5204 «= «5657 (044 13-3434. 3691 «= «4078 = 4420 
of 15 «5587 «= «B01. « «63566738 14 3004S 4171S 4572 4922 














TABLE 1—(cont.) 





N X .005 .010 .025 .050 N X .005 .010 .025 .050








15 4402 4679 5090 5444 5 501.7 592.5 746.0 898.1 
16 4934 5217 5634 5990 6 725.3 837.5 1023 1202 
17 5505 5793 6211 6563 7 974.3 1107 1321 1525 
18 6129 6417 6830 7174 8 1246 1397 1638 1863 





19 6829 7112 7513 7839 9 1537 1705 1971 2216 
20 7673 7943 8316 8609 10 1848 2031 2319 2582 
11 2176 2374 2682 2961 
21 1 2.387 4.785 12.05 24.40 12 2521 2733 3059 3352 
2 50.37 72.22 117.5 171.9 13 2884 3108 3450 3754
3 167.7 215.9 304.9 401.0 14 3264 3499 3854 4169 
4 339.5 414.2 544.6 678.1 15 3662 3906 4274 4596
5 553.3 653.2 821.8 988.5 16 4079 4331 4708 5036 
6 801.2 924.7 1128 1324 17 4517 4776 5160 5490 
7 1078 1224 1459 1682 18 4978 5242 5630 5961
8 1381 1546 1811 2057 19 5466 5733 6123 6451 
9 1707 1891 2182 2450 20 5988 6256 6641 6964 
10 2055 2257 2571 2858 21 6554 6819 7196 7507 
11 2425 2642 2978 3281 22 7186 7443 7805 8098
12 2815 3047 3402 3719 23 7942 8185 8518 8779
13 3228 3472 3844 4172 
14 3663 3919 4303 4641 24 1 2.088 4.187 10.54 21.35
15 4122 4387 4783 5126 2 43.95 63.03 102.6 150.1 
16 4608 4880 5283 5630 3 145.9 187.9 265.6 349.5 
17 5124 5402 5809 6156 4 294.7 359.8 473.5 590.1
18 5678 5959 6366 6708 5 479.3 566.2 713.2 858.8 
19 6282 6561 6962 7294 6 692.5 799.9 977.3 1149 
20 6957 7232 7618 7933 7 929.7 1056 1262 1457 
21 7770 8031 8389 8671 8 1188 1332 1563 1780 
9 1465 1626 1880 2116 
22 1 2.278 4.567 11.50 23.29 10 1759 1935 2211 2464 
2 48.03 68.88 112.0 164.0 11 2070 2260 2555 2824
3 159.7 205.7 290.6 382.2 12 2396 2599 2912 3194 
4 323.1 394.3 518.7 646.0 13 2738 2953 3282 3576 
5 526.2 621.4 782.1 941.1 14 3096 3322 3664 3968 
6 761.3 878.9 1073 1260 15 3470 3705 4059 4371 
7 1024 1162 1386 1599 16 3860 4104 4468 4786
8 1310 1468 1720 1956 17 4268 4519 4891 5213 
9 1618 1793 2071 2327 18 4696 4952 5329 5653 
10 1946 2138 2439 2713 19 5145 5405 5785 6109 
11 2293 2501 2822 3113 20 5621 5882 6262 6582
12 2660 2881 3221 3526 21 6127 6538 6764 7078 
13 3046 3280 3636 3952 22 6676 6934 7300 7602 
14 3451 3696 4066 4391 23 7287 7539 7887 8171 
15 3877 4132 4513 4846 24 8019 8254 8575 8827 
16 4326 4588 4978 5315 
17 4799 5068 5463 5802 25 1 2.005 4.019 10.12 20.50 
18 5301 5574 5972 6309 2 42.16 60.46 98.39 144.0 
19 5839 6113 6509 6841 3 139.9 180.2 254.7 335.2 
20 6423 6695 7084 7405 4 282.3 344.7 453.8 565.6 
21 7076 7342 7716 8019 5 458.9 542.2 683.1 822.9
22 7860 8111 8456 8727 6 662.6 765.5 935.6 1101 
7 888.9 1010 1207 1395 
23 1 2.179 4.369 11.00 22.28 8 1135 1273 1495 1703 
2 45.90 65.82 107.1 156.7 9 1399 1553 1797 2024 
3 152.5 196.4 277.5 365.2 10 1679 1848 2113 2356 
4 308.3 376.3 495.1 616.8 11 1974 2156 2440 2699 






















TABLE 1—(cont.) 










N X .005 .010 .025 .050 N X .005 .010 .025 .050
12 «2284S «2479 = 2780 = 3051 15 3002S «3213.«S 35333816 
13 © 2607,-—='s«2814 = 3131344 16 =. 3331.«S 3550 = 38804171 
14 «2046S 316334933786 17 +: 3672-S«« 3808 «= «4237 «4534 
15 3208 «= «3525 «= 38674168 18 4025 4257 «= «4604 «4905 
16 3665 «= «390042524561 19 4392 4630 4982-5286 
17 4048 «= 4289 4651 (4964 20 4774 «8016. «S«5372—S«5677 
18 4447 4604 «50625378 21 ««5173.—ss«BAI7-—Ss«5774=— 6079 
19 4864 5116 5487 5805 22 «= «B589s«83S C1924 
20 5302 5557 5930 6246 23 «© 6027,-«Ss«273«Ss«627 Ss 6924 
21 5765 6021 6392 6704 24 «©6493 «6736 «= 7084 «7378 
22 «257s 512——s«87B=OC7183 25 «= 6906S s«7234=S ss 7571=— 7847 
23 «©=-6700-«s—«s7041.=Ss«7397 = 7600 26 ©7554 «= «7783)=S«8103 «(8360 
24 «= 7382—«—s«=7625—=—i«7965;~—=C(‘é BD 27 = 8218. «8432S 8723 «8950 
25 8090 «8318 «= 8628 «8871 

2 «= 1s«.790 = 8.589 = 9.088 ~— «18.30 

2% 1 1.9298 3.865 9.733 19.71 2 37.57 53.88 87.70 128.4 
2 40.51 58.10 94.55 138.4 3 124.4 160.8 226.7 298.5 
3 134.3 173.0 244.6 322.0 4 250.7 306.2 403.4 503.1 
4 271.0 330.8 9435.6 543.1 5 406.8 480.9 606.4 731.1 
5 440.1 520.1 655.5 789.8 6 586.5 678.0 829.6 976.9 
6 635.1 733.9 897.4 1056 7 785.6 893.5 1069 1237 
7 851.6 960.8 1157 1338 8 1002 i125 1322 1509 
8 1087 1220 1433 1633 9 1232 1370 1588 1791 
9 1338 1487 1721 1940 10 1477 «1627 =S «1864 = 2082 
10 = 16051768 «= 2023S 2257 11 «1733 «1806. «21502383 
11 =—«-'1887, 206223352584 12 2002,-'ss«2176=— 2446 «2691 
12 «2181-2369 2659S 202 1322822466 «= 2751 «=: 3007 
13-2489 2688) «20933266 14 —«-:2572,-s«-2767 «= 30658331 
14 = 2809-8018 «S 8337S (3621 15 2874 «= 3078 «= 8387 = (3662 
15 «3143S: 3361=S 3602S 3084 16 ©3186. «= 3398 «= 3718 += 4000 
16 3490S 3716 = 40574357 17-3510 3729S 4058 = 4346 
17-3850 4084 = 44334739 18 ©3845 «= «4070 «=: 4407 = 4700 
18 42254465) 48215130 19 4192 4422« «4765 = 5062 
19 4615 4860 5222 5532 20 4552 «4786 = 5134 = 5433 
20 6023 5271 5636 5946 21 «4925 «= «5163S s«5513S« 813 
21 «5450S 5700» 6065374 22 «53135554 «5905 = 6208 
22 «© 5900s«G1S1 = «65136818 23 «= -B720.-s«B961=Ss 11S 6606 
23 «= «63796628 = 60857281 24 «= «6147,—««s«6387 = «6734 = 7028 
24 «= 680671417487) 7771 25 «= 6601.«««6838.—=s7177~=S 7459 
25 7471 7707 8036 8301 26 ©7089 «©7321«S 7650S 7918 
26 «© 8156 «Ss 8377 «s«8677-—=S «8912 27° ««-7631.=S«s«7854= «81658415 

28 «= 8276 «=: 8483 «8766 = 8985 

27 «1s«i1,856 3.722 «9.373 «18.98 
2 38.99 55.91 91.00 133.2 29 #1 1.728 3.465 8.726 17.67 
3 129.2 166.4 += 235.3 309.8 2 36.25 52.00 84.64 123.9 
4 260.4 318.0 418.9 522.3 3 120.0 «154.6 218.6 288.0 
5 422.8 499.7 630.0 759.3 4 241.7 295.2 389.0 485.2 
6 609.8 704.9 862.2 1015 5 302.0 463.4 584.6 704.9 

7 817.3 920.3 1111 1285 6 564.9 653.2 799.4 941.6 
8 1042 1170 1375 1568 7 756.4 860.4 1030 1192 
9 1283 1426 1652 1862 8 963.9 1083 1273 1453 
10 1588S «160519402168 9 1185 1318 1528 1725 
11 ©1807 Ss «:1976 += 2239S 2479 10 = 1420S 1565 «1794 = 2005 
12 2088S 2268 «= 2548 (801 11 1666 ~=— «1823 «2069 «2298 
13-2381 S572 «28673131 12 1923-2001 «23522589 
14 2686. « 2887 «= 3195-3470 13-2191 «2869 «2645 2893 











TABLE 1—(cont.) 








N X .005 .010 .025 .050 N X .005 .010 .025 .050
14 2469 2657 2945 3203 9 1102 1225 1422 1606 
15 2757 2953 3253 3520 10 1318 1454 1668 1866 
16 3055 3259 3569 3844 11 1546 1693 1923 2134 
17 3362 3574 3894 4175 12 1783 1940 2185 2408
18 3680 3899 4226 4512 13 2029 2197 2455 2688 
19 4009 4233 4567 4857 14 2285 2461 2732 2975 
20 4349 4578 4917 5210 15 2549 2733 3015 3267 
21 4701 4934 5276 5571 16 2821 3014 3306 3566 
22 5067 5302 5646 5941 17 3102 3302 3603 3870 
23 5447 5683 6028 6320 18 3392 3598 3908 4180 
24 5843 6080 6423 6711 19 3691 3902 4219 4496 
25 6260 6495 6834 7116 20 3998 4214 4537 4818 
26 6702 6934 7265 7539 21 4315 4536 4863 5146 
27 7177 7404 7723 7985 22 4642 4866 5196 5481 
28 7704 7921 8224 8466 23 4980 5206 5539 5823 
29 8330 8532 8806 9019 24 5329 5557 5891 6174 

25 5692 5920 6253 6534 

30 1 1.671 3.349 8.436 17.08 26 6070 6298 6627 6904 

2 35.03 50.24 81.78 119.8 27 6467 6693 7017 7286 
3 115.9 149.3 211.2 278.2 28 6887 7109 7425 7685 
4 233.3 285.0 375.5 468.5 29 7338 7554 7858 8105 
5 378.2 447.2 564.2 680.5 30 7837 8043 8329 8559 
6 544.8 630.1 771.4 908.8 31 8429 8620 8878 9079 
7 729.2 829.7 993.4 1150 
8 928.9 1044 1228 1402 32 1 1.566 3.140 7.909 16.02 
9 1142 1270 1473 1663 2 32.81 47.06 76.61 112.2 
10 1367 1508 1729 1933 3 108.4 139.7 197.7 260.4 
11 1604 1755 1993 2211 4 218.2 266.5 351.3 438.5
12 1850 2013 2266 2495 5 353.4 418.0 527.5 636.5 
13 2107 2280 2546 2787 6 508.7 588.4 720.8 849.6 
14 2373 2555 2834 3085 7 680.4 774.4 927.7 1074 
15 2648 2839 3129 3389 8 866.1 973.4 1146 1309 
16 2933 3131 3433 3699 9 1064 1184 1375 1553 
17 3227 3432 3743 4016 10 1273 1404 1612 1804 
18 3530 3742 4060 4339 11 1492 1634 1857 2062 
19 3843 4060 4386 4669 12 1720 1873 2110 2326 
20 4166 4388 4719 5005 13 1957 2119 2370 2597 
21 4499 4726 5061 5349 14 2203 2374 2636 2873 
22 4844 5074 5411 5701 15 2456 2636 2909 3154 
23 5201 5433 5772 6061 16 2718 2905 3189 3441 
24 5573 5805 6144 6430 17 2988 3181 3474 3733 
25 5960 6192 6528 6810 18 3265 3465 3766 4031 
26 6366 6597 6928 7204 19 3551 3756 4064 4335 
27 6797 7024 7347 7614 20 3844 4055 4369 4644 
28 7260 7481 7793 8047 21 4146 4361 4681 4958 
29 7773 7985 8278 8514 22 4457 4676 4999 5279 
30 8381 8577 8843 9050 23 4778 4999 5325 5606 
24 5109 5332 5660 5940 
31 1 1.617 3.242 8.164 16.53 25 5451 5675 6003 6281 
2 33.88 48.60 79.11 115.8 26 5805 6030 6356 6631 
3 112.1 144.4 204.2 269.0 27 6175 6399 6721 6992 
4 225.5 275.4 363.0 453.0 28 6562 6784 7101 7364 
5 365.4 432.1 545.2 657.8 29 6972 7189 7498 7752 
6 526.1 608.5 745.2 878.2 30 7412 7623 7919 8161 
7 704.0 801.1 959.4 1111 31 7898 8099 8378 8602 
8 896.4 1007 1186 1354 32 8474 8660 8911 9106 
















TABLE 1—(cont.) 











N X .005 .010 .025 .050 N X .005 .010 .025 .050
6 33 1 1.519 3.045 7.660 15.53 23 4422 © 4634. 's«4048=—Ss«*5 218 
6 2 31.80 45.61 74.26 108.8 24 4722 4936 «= «5 253s« 524 
4 3 105.1 135.4 191.5 252.4 5 5030 «45246 «= s«5564 «5886 
8 4 211.3 258.2 340.3 424.8 26 5348 5565 5883 6154 
8 5 342.2 404.7 510.9 616.6 27 5676 ©5804. s«6210—Ss«6479 
6 492.3 569.6 697.9 822.8 28 «© «6015.—(—«28BCti«éAT:—Ss«CSBIZ 
7 7 658.3 749.4 898.0 1040 29 6360 «6584. s«804—Ss«7154 
6 8 837.8 941.7 1109 1268 30 6738 6951 7255 7507 
0 9 1029 1145 = «1330S s«1508 31 7129 7337 7632 7875 
0 10 1231 1358 1559 1746 32 7548 7749 8032 8262 
6 il 1442-1580) s«:1798—Ss«1995 33 g009 8202 8467 8679 
3 12 1662 1810 2040 2250 34 8557 8733 «= 8972—Ss«)57 
J 13 1800 2047 2201 2611 
4 2127 ©2203 s«24B S278 35 1 1.482 2.871 7.231 14.64 
15 9371 2545 2811 3049 2 20.96 42.97 69.97 102.5 
16 2623 2804 3080 3326 3 98.91 127.6 180.4 237.7 
17 2881 3070 3355 3608 4 198.8 242.9 320.3 399.9 
| 18 3147 3342S 3636=S 3894 56 321.7 380.6 480.6 580.2 
| 19 3419 3621 3922 4185 6 462.7 535.4 656.2 773.9 
! 20 3702 3907 4214 4482 7 618.3 704.0 844.1 978.3 
21 3900 4200 4518 4784 8 786.4 884.3 1042 1191 
22 4287 4501 «© 4818 «= 5092 9 965.3 1075 1249 1412 
23 4503 4809 «= 5129» 5405 10 1154 «:1274S's«1464~—Ss«1640 
24 4907 «5126S s«5448=—Ss«5 724 ll 1351 1481 1685 1878 
25 6231 5452 5774 6050 12 1556 1696 «=-«:1913- «2112 
26 8566 60-5787 «= 6109S s«6383 13 1769 1918 2147 2356 
27 5913 6134 «Ss «6454 06724 M4 1989 2146 2387 2605 
28 6274 6494 6810 7075 15 2216 2380 2632 «=. 2859 
29 6653 6870 7180 7438 16 2450 2621 2882 93117 
30 7053 «7265 «= 7567 «= 7815 17 2690 2868 3138 3379 
31 7482 7688 #7077 8213 18 2936 3121 «= 3399 -S«—«8646 
32 7955 815284248642 19 3189 3379 3665 3917 
33 8517 8607 8942 9132 20 3448 3643 3935 4192 
21 3714 -3013.'s«42NSi(‘tié ATID 
4 11.474) 2.986 7.444 = (15.07 22 3086 4189 «= «4492—s«4756 
2 30.85 44.25 72.05 105.5 23 4265 4472 4779 5045 
8 101.9 131.3 185.8 244.8 24 4551 4761 65072 5339 
4 204.8 250.3 330.0 412.0 25 4846 5058 «= «5370S s«B 638 
56 331.6 302.3 495.3 597.8 26 5148 © 5361. «Ss«5675 = s«B 942 
6 477.0 552.0 676.4 797.6 a7 5450 5674 5987 6253 
7 687.7 726.0 «870.2 += 1008 28 5780 5905 6306 6570 
8 811.3 912.1 1075 1228 29 6113 6326 6635 6894 
9 996.1 1109 1288 1456 30 6458 6670 6974 7228 
10 1191 1315 1510 1691 31 6820 7020 7326 7573 
11 1305 1529 1739 19382 32 7202 7406 7694 7931 
12 1607 1751 «1975s 2.179 33 7611 7808 8084 8309 
13 1828 1980 2217 2431 34 8062 8250 8509 8715 
4 2056 «46022172465 2688 35 8595 8767 9000 819 
15 2291 2460 2719 2951 
16 2538 2700 2078 #3718 36 1 1.302 «2.701 «7.080 = 14.24 
17 9782 2965 3243 3490 2 20.11 41.76 68.00 99.61 
18 3038 3227 3513 3766 3 96.10 123.9 175.3 231.0 
19 3300 3495 3789 4047 4 193.1 235.9 311.2 388.5 
20 3570 3770 4070 4332 5 312.4 369.6 466.8 563.6 
21 3847 4051 4357 4623 6 449.1 519.8 637.2 751.6 
22 41381 4330 4649 «©4918 7 600.1 683.3 819.4 949.9 














TABLE 1—(cont.) 








N X .005 .010 .025 .050 N X .005 .010 .025 .050
8 763.0 858.1 1012 1157 27 5077 5284 5588 5859 
9 963.3 1043 1212 1371 28 5368 5576 5880 6140 

10 1119 1236 1420 1591 29 5667 5876 6179 6436 
11 1310 1436 1635 1818 30 5975 6183 6485 6739 
12 1509 1644 1856 2049 31 6294 6501 6799 7048 
13 1714 1859 2082 2285 32 6626 6830 7123 7367 
14 1927 2079 2314 2526 33 6972 7172 7548 7695 
15 2146 2306 2551 2772 34 7337 7532 7809 8036 
16 2371 2539 2793 3022 35 7727 7916 8181 8395 
17 2603 2777 3040 3275 36 8158 8336 8584 8781 
18 2841 3021 3292 3533 37 8666 8830 9051 9222
19 3085 3270 3549 3795 
20 3334 3524 3810 4061 38 1 1.319 2.644 6.660 13.49 
21 3590 3784 4076 4331 2 27.56 39.54 64.39 94.32 
22 3851 4050 4347 4605 3 90.93 117.2 165.9 218.6 
23 4119 4322 4622 4883 4 182.6 223.1 294.3 367.6 
24 4393 4599 4903 5166 5 295.3 349.4 441.4 533.1 
25 4675 4883 5190 5454 6 424.3 491.2 602.3 710.7 
26 4964 5174 5482 5746 7 566.6 645.4 774.3 897.9
27 5260 5471 5780 6043 8 720.1 810.1 955.4 1093 
28 5566 5777 6085 6347 9 883.3 983.8 1144 1295 
29 5880 6091 6398 6657 10 1055 1166 1340 1503 
30 6206 6416 6719 6973 11 1235 1354 1542 1716 
31 6544 6752 7050 7299 12 1421 1550 1750 1934 
32 6898 7102 7394 7636 13 1615 1751 1963 2156 
33 7271 7471 7753 7985 14 1814 1958 2181 2383 
34 7671 7863 8134 8353 15 2019 2171 2404 2614 
35 8111 8294 8547 8749 16 2230 2389 2631 2848
36 8631 8799 9026 9202 17 2447 2612 2862 3087 
18 2669 2840 3098 3329 

37 1 1.355 2.716 6.840 13.85 19 2896 3072 3338 3574 
2 28.32 40.62 66.15 96.89 20 3128 3309 3582 3822 
3 93.44 120.4 170.4 224.6 21 3365 3551 3830 4075 
4 187.7 229.4 302.5 377.8 22 3608 3798 4083 4331 
5 303.6 359.2 453.7 547.9 23 3857 4050 4339 4591 
6 436.3 505.1 619.3 730.6 24 4110 4307 4599 4854 
7 582.9 663.8 796.2 923.2 25 4370 4569 4865 5121 
8 740.9 833.4 982.7 1124 26 4635 4837 5135 5392 
9 909.1 1012 1177 1332 27 4906 5110 5410 5667 

10 1086 1199 1379 1546 28 5185 5390 5690 5947 
11 1271 1394 1587 1765 29 5470 5676 5976 6231 
12 1464 1595 1801 1990 30 5764 5970 6269 6521 
13 1663 1803 2021 2219 31 6067 6271 6568 6817 
14 1869 2017 2246 2453 32 6379 6582 6875 7120 
15 2081 2236 2475 2691 33 6703 6904 7192 7431 
16 2299 2461 2710 2933 34 7042 7239 7520 7751 
17 2522 2691 2949 3178 35 7399 7591 7862 8084 
18 2752 2927 3192 3428 36 7781 7966 8225 8435 
19 2987 3168 3440 3681 37 8202 8377 8619 8811
20 3227 3413 3692 3938 38 8699 8859 9075 9242 
21 3474 3664 3949 4199 

22 3726 3920 4210 4464 39 1 1.285 2.577 6.490 13.14 
23 3983 4181 4476 4732 2 26.85 38.51 62.72 91.88 
24 4247 4448 4746 5005 3 88.54 114.1 161.5 212.9
25 4517 4721 5022 5282 4 177.8 217.3 286.6 358.0 
26 4793 4999 5302 5563 5 287.4 340.1 429.7 519.0 































TABLE 1—(cont.) 

N X .005 .010 .025 .050 N X .005 .010 .025 .050
89 6 412.9 478.0 586.2 601.9 22 = 3306.«S «3577 = 8850 = 4088 
8140 7 551.3 628.0 753.5 874.0 23 «= 3627S «3812 4090-4332 
5436 § 700.5 788.1 929.6 1064 24 = 3863«S «4052 4333°= «4578 
5739 9 859.0 956.9 1113 1260 25 «= 4104« «4206 «= 4581 = 4828 
7048 10 ©—:1026.—S «1133. :1304 = 1462 26 «= 43504344 « 4832 «(5081 
7367 11 =: 1200-Ss«1317,— «15001669 27 = 4600-4797 «= 5087-5337 
7695 12 :1381~S «1506 = 17021881 28 © 4857S « 50555347597 
3036 13 = 1569) «1702,- 1909-2097 29 «5119S «5319 6115861 
395 141762, 1908. 2121-2317 30 © 5388«= «5588 58806129 
781 15 =: 1961S 2109-23375 31 5663 «= 5863S «1556402 
999 16 2165 2320-2557 2769 32-5046 6146 = 64356680 

17 = 2375S 2536 = 2781 = 3000 33 «6237-6436 = 67226063 
49 18 =. 2590-2757 = 3010 3.235 34 © 6537_-—« «6734 = 7017-7253 
39 19 = -2810 2982 32423473 35 = 6849—««7043-« 73207550 
8.6 20 3034 3211 3478 ©3714 36 = 7174.«S «7364 «= 7634 «= 7856 
76 21 8264. 3446 = 37183958 37 «7516S 7701S 79618174 
31 22 3499-3685 3963 «(4206 38 —-7882-—««s«8060)= 8308 = 8509 
0.7 23 «= 3738 = 3928S 42M 4457 39 = 8285-8453 86848868 
7.9 24 «= 3982s «4175—(iiG ATI 40 8759 8913 9119 9278 
093 2 4232 «= 4428« 4718 = (4970 
295 26 46 4487-4686 49795282 41 11,223 2.451 6.1738 12.50 
503 27 47484049 52435497 2 25.52 36.62 59.63 87.36 
716 28 48= «8015-27135 766 3 84.13 108.4 153.5 202.4 
934 29 «= 5289S 5491 «5787 = 6040 4 168.8 206.4 272.3 340.2 
156 30 © 5569S «57726087 ~=—«63118 5 272.8 322.9 + 408.1 493.0 
383 31 5857 «60606354 «6602 6 301.8 453.7 556.6 657.0 
814 32154 635560476892 7 522.9 595.8 715.2 829.8 
848 336460 6660 69477188 8 664.2 747.5 882.1 1010 
087 34 © «6778.«S «697572577492 9 814.3 907.3 1056 1196 
299 35 711073037578 =—(7805 10 972.1 «1074S :1236—1387 
574 3607459 7647)= 7913S 8130 11 s37,—s«:1248-s1422 1583 
309 37 -7833.= «8014S 82688473 12-1308. —S «42716131784 
075 38 = 8245 8416 = 8652884 13 «148516111808 1988 
331 39 © 8730S «8886 «(9098 = (9261 14 1667S 180120082196 
91 15 1855 199522122408 
354 40 «11,253 2.512 6.327 12.81 16 © - 2047S «2194 = 24202623 
21 2 26.17 «37.54 61.14 89.57 17 2244-2397 26322841 
92 3 86.28 «11.2 157.4 207.5 18 ©2447-2004 28473062 
67 4 173.2 211.7 279.3 348.8 19 26522816 = 3066 = 8287 
47 5 279.9 331.3 418.6 505.7 20 2863 «= 3032, 3288 = 3514 
31 6 402.1 465.5 571.0 674.0 21 © 3079S «3252s B14 8744 
1 7 536.7 611.5 733.8 = 851.3 223208 «= 3476 = 87433977 
17 8 681.9 767.3 905.2 1036 23 «3522, 3703S «3976 = 4214 
20 9 836.1 931.4 1084 1227 24 «3750 3935) 42124453 
31 10 998.3 1108. :12691424 25 «= 3983s 41744514604 
51 11 = 11681281, 14601625 26 «= 42204411 4694 = 4939 
34 12 1344S :1466 16561831 27 4462 4655 49415187 
35 13 1526 165518572041 28 © 47094904 51925438 
1 14-1713, «185020632255 29 © 4962,s«5158— 447) 5603 
‘9 15 = 1906. 2051-2273 2473 30 5219 5417 5706 + 5952 

16-2104 S256) 2486 2604 31-8483. 568159706215 
14 17 2307-2465 2704 «2918 32 «5754 S951 = 62396482 
38 18 2516 2679 ©2927 «3146 33 © 6031.—« «6228 «= «6513754 
9 19 ©2728 «2897, 31513377 34 ««6317—«s«G512_—s« 794 = 7082 


0 20 2946 3119 3380 3611 35 6611 6805 7083 7315 
0 21 3169 3346 3613 3848 36 6917 7108 7380 7605 














TABLE 1—(cont.) 








N X .005 .010 .025 .050 N X .005 .010 .025 .050
37 «-7236«S«7492)=—s«7687~=—s(7905 8 631.5 710.8 839.1 960.9 
38 7571 7752 8008 8216 9 773.9 862.5 1004 1138 
39 7929 «= 8104 «Ss 83478548 10 923.6 1021 1176 1320 
40 8323 8488 8715 8894 il 1080 1185 = «13521506 
41 8788 8938 9140 9205 12 1242 1385 s«1533 (1696 

13 1409 ©=—«:1580s«1718 1890 

42 1 1.193 = 2.393) -6.026 12.21 14 1582 1710» 19082088 

2 24.01 35.74 58.20 85.27 15 1759 ©1804 = 2101S (2288 
3 82.09 105.8 149.8 197.5 16 1941 2081 «= 22972492 
4 164.7 201.3 265.6 331.9 17 —«-2127,—Ss«2278=— 2497 = 2698 
5 266.1 315.0 398.1 481.0 18 2318 2469 2701 2907 
6 382.0 442.4 542.8 640.9 19 2513 2669 2908 3120 
7 509.8 581.0 697.4 809.3 20 2711.-=Ss« 2878 = 8118 = 8335 
8 647.4 728.7 860.1 984.7 21 2012 3080 3331 «3553 
9 793.6 884.3 1030 1166 22 «3120-3291. = 8547 = 873 
10 ©947.2«S«:1047—Ss«1205— «1388 23 3331 3505 3766 3996 
ul 1107 ««1216-—(s«1386 S544 24 «= 3545 3723 30884221 
12 1274 1390 1572 1739 25 «= 3763S (si8044 A244 
13 1446 ©=-1570s«1762 «(1938 26 46 «3084S «4168 = 44424679 
14 1624 1754 1957 2141 27 «= 4210S ««4396= 46734912 
15 1806 194321552347 28 40s «4441—i(i‘2D «4008148 
16 1991 2136 2357 2556 29 «©: 4676«S ss: 48660=— «51465387 
17 —«-2183 «2334 = 2563-2768 30 4915 5106 5388 5629 
18 2381 2535 2773 2983 31 5159 5351 5633 5874 
19 2581 2740 2984 3201 32 5408 5601 5883 6123 
20 2785 + «=. 2950s «32003422 33 5663 5856 6137 6375 
21 «2004S «3164 = 3420S 3646 34 «5924 = «116 «= 6306 =—6632 
22 3207 3381 3642 3872 35 6192 6383 6660 4893 
23 © 3424 «Ss 3802-3868 = 4102 36 6467S «665769307150 
24 «3645S 8826 «= 4097 «4333 37. «6751s 698872077431 
25 3870 4054 4329 4568 38 © 7045 «= 7229=S 7492 = 7709 
26 4090 4286 4564 4806 39 7351 7531 7787 7996 
27 ««s«4333,=«i«<“‘«wS DiS 40 7673 7848 8004 8204 
28 4571 4762 5045 5289 41 8018 8185 8419 8607 
29 © 4814. «S«5007-«5202—Ss«5536 42 8396 8554 8771 8944 
30 5062 5257 5542 5786 43 8841 8984 9178 9327 

31 5316 5511 5796 6039 

32 «5575 = «5770S 0556297 44 1 1.139 -2.284 «5.752 11.65 
33 «5841 «6086 = «6319 «6559 2 3.77 34.09 55.53 81.36 
34 6113 6307 6588 6825 3 78.28 100.9 142.9 188.4 
35 6394 6586 6864 7007 4 157.0 191.9 253.3 316.5 
36 6683 6873 7146 7374 5 253.6 300.2 379.4 458.6 
37 «««6982.—«é<“‘z74GO«=«C«C7437~=s«(76858 6 363.9 421.5 6517.3 610.9 
38 7205 7478 7738 7952 7 485.5 553.3 664.4 771.3 
39 7623 7801 8052 8256 8 616.4 693.8 819.2 938.2 
40 7974 8145 8384 8576 9 755.2 841.8 980.4 1111 
41 8360 8521 8743 8919 10 901.2 996.8 = 11471288 
42 «8815s 8962~—S«s«éSD «9312 1 1053 1157'S s«1319=—:1470 
12 1211 1322 1496 1655 
43 1 1.166 © 2.887 «5.886 11.92 13 1375 ©=—-:1492S («1676 = s«1845 
2 24.32 34.90 568.3 83.27 14 1543 1667 +«—-1861 2037 
8 80.14 103.3 146.3 192.8 15 1715 1847-2049 S233 
4 160.7 196.5 259.3 324.0 16 1802 «=: 2080 '=««224kS 2431 
5 259.7 307.4 388.5 469.5 17 «-:2078.-——s«2216 = -2436 = 2632 
6 372.8 431.7 529.8 625.5 18 2259 2406 2634 2836 
7 497.3 566.8 680.5 789.9 19 © - 2448 «Ss 2601 «2835 = (3048 











TABLE 1—(cont.) 














 N   X    .005    .010    .025    .050      N   X    .005    .010    .025    .050

    20    2641    2799    3039    3252         31    4873    5060    5335    5571
    21    2838    3000    3247    3464         32    5105    5293    5569    5804
    22    3038    3205    3457    3678         33    5342    5530    5806    6040
    23    3242    3414    3670    3895         34    5583    5771    6046    6280
    24    3450    3625    3886    4114         35    5829    6018    6291    6523
    25    3662    3839    4104    4335         36    6081    6269    6540    6770
    26    3877    4057    4325    4559         37    6340    6526    6794    7021
    27    4095    4279    4550    4786         38    6606    6790    7055    7276
    28    4318    4504    4778    5015         39    6879    7061    6321    7537
    29    4545    4732    5008    5247         40    7162    7341    7595    7805
    30    4776    4965    5242    5481         41    7457    7631    7878    8080
    31    5012    5201    5480    5718         42    7768    7936    8173    8366
    32    5252    5442    5721    5959         43    8099    8260    8485    8666
    33    5497    5688    5966    6208         44    8462    8614    8823    8989
    34    5748    5939    6216    6451         45    8889    9027    9213    9356
    35    6004    6194    6470    6702
    36    6267    6456    6729    6958     46   1   1.090   2.185   5.502   11.14
    37    6538    6725    6994    7219          2   22.72   32.60   53.10   77.80
    38    6816    7001    7265    7485          3   74.81   96.45   136.6   180.1
    39    7105    7286    7545    7758          4   150.0   183.4   242.0   302.5
    40    7405    7582    7833    8039          5   242.2   286.7   362.5   438.2
    41    7722    7893    8135    8331          6   347.6   402.5   494.1   583.6
    42    8059    8223    8453    8638          7   463.4   528.2   634.5   736.7
    43    8430    8584    8798    9867          8   588.1   662.2   782.0   895.9
    44    8866    9006    9196    9342          9   720.4   803.2   935.7    1061
                                               10   859.4   950.3    1095    1230
45   1   1.114   2.233   5.625   11.39         11    1004    1103    1259    1403
     2   23.23   33.30   54.29   79.54         12    1154    1260    1427    1580
     3   76.51   98.63   139.6   184.2         13    1310    1423    1599    1760
     4   153.4   187.5   247.5   309.3         14    1469    1589    1774    1943
     5   247.7  293.38   370.8   448.2         15    1633    1759    1954    2129
     6   355.5   411.8   505.4   596.9         16    1801    1933    2136    2318
     7   474.2   540.5   649.1   753.6         17    1973    2110    2321    2509
     8   601.9   677.6   800.2   916.6         18    2149    2291    2509    2703
     9   737.4   822.0   957.5    1085         19    2328    2475    2700    2900
    10   879.8   972.7    1121    1258         20    2511    2663    2894    3098
    11    1028    1129    1288    1436         21    2697    2854    3090    3299
    12    1182    1291    1460    1617         22    2887    3048    3289    3502
    13    1341    1457    1637    1801         23    3080    3245    3491    3708
    14    1505    1627    1817    1989         24    3276    3444    3695    3916
    15    1673    1802    2000    2180         25    3475    3646    3902    4126
    16    1846    1980    2186    2373         26    3678    3852    4111    4338
    17    2022    2162    2375    2569         27    3884    4061    4323    4552
    18    2202    2348    2568    2768         28    4093    4272    4538    4768
    19    2386    2537    2765    2969         29    4306    4487    4756    4987
    20    2574    2729    2964    3173         30    4523    4705    4976    5208
    21    2766    2925    3166    3380         31    4743    4927    5198    5432
    22    2961    3124    3371    3588         32    4967    6152    5424    5658
    23    3159    3326    3578    3799         33    5195    5382    5654    5887
    24    3361    3532    3788    4012         34    5428    5615    5887    6119
    25    3566    3740    4001    4228         35    5666    5852    6123    6354
    26    3775    3952    4216    4446         36    5908    6094    6363    6592
    27    3987    4167    4434    4666         37    6156    6341    6608    6834
    28    4203    4385    4655    4888         38    6410    6594    6858    7080
    29    4423    4607    4878    5113         39    6671    6853    7113    7331
    30    4646    4832    5105    5341         40    6940    7119    7374    7587














TABLE 1—(cont.) 








005 -010 -025 -005 -010 050 
7218 7393 7643 71.64 92.36 172.5 
7507 7679 7921 143.6 175.5 289.7 
7812 7978 8210 231.7 274.4 419.5 
8137 8295 8516 332.4 385.1 585.6 
8493 8642 8847 443.2 505.3 705.0 
8912 9047 9229 562.3 633.3 857.3 
688.7 767.9 1015 
1.066 2.138 5.385 821.3 908.4 1176 
22.23 31.90 51.96 959.5 1054 1342 
73.19 94.36 133.6 1103 1204 1510 
146.7 179.4 236.8 1251 1359 1683 
236.8 280.4 354.6 1403 1517 1857 
339.8 393.6 483.2 1559 1679 2035 
453.1 516.5 620.5 1719 1845 2215 
574.9 647.4 764.7 1883 2014 2398 
704.2 785.2 914.9 2050 2186 2583 
839.9 928.9 1070 2221 2361 2770 
981.3 1078 1230 2394 2539 2959 
1128 1232 1395 2571 2721 3150 
1280 1390 1562 2751 2904 3344 
1435 1552 1734 2934 3091 3540 
1595 1718 1909 3119 3280 3737 
1759 1888 2087 3307 3472 3936 
1926 2061 2268 3499 3667 4137 
2097 2237 2451 3694 3864 4340 
2272 2417 2637 3891 4065 4546 
2451 2600 2826 4092 4267 4753 
2633 2786 3018 4296 4473 4962 
2818 2974 3212 4503 4682 5174 
3005 3166 3408 4714 4894 5387 
3195 3360 3607 4928 5109 5603 
3389 3557 3808 5145 5327 5821 
3587 3757 4012 5366 5549 6043 
3787 3960 4218 5592 5774 6267 
3990 4166 4427 5822 6004 6493 
4196 4375 4639 6057 6238 6723 
4406 4586 4853 6296 6477 6956 
4620 4801 5069 6542 6721 7193 
4837 5020 5288 6794 6971 7435 
5057 5242 5510 7054 7228 7681 
283 5467 5736 7322 7493 7934 
5512 5696 5965 7602 7768 8194 
5745 5929 6197 7896 8056 8364 
5984 6167 6434 8209 8362 8746 
6228 6410 6674 8552 8695 9049 
6477 6659 6919 8955 9085 9395 
6734 6913 7170 
6998 7174 7426 1 1.023 2.051 10.46 
7271 7444 7690 2 21.32 30.58 73.01 
7556 7724 7962 3 70.15 90.45 168.9 
7855 8018 8246 4 140.6 171.9 283.6 
8173 8329 8546 5 226.9 268.6 410.8 
8523 8669 8871 6 325.4 377.0 546.9 
8934 9067 9245 7 433.7 4094.5 690.2 
8 550.3 619.7 839.2 
1.044 2.094 5.273 9 673.8 751.4 993.1 
21.77 =31.23 = 50.87 10 803.5 888.8 1151 
















TABLE 1—(cont.) 








N X .005 .010 .025 .050 N X .005 .010 .025 .050
11 938.6 1031 1177 1313 6 318.6 369.2 453.4 535.7 
12 1079 1178 1334 1478 7 424.7 484.3 581.9 676.0 
13 1223 1329 1495 1646 8 538.8 606.8 717.0 821.8 
14 1372 1484 1658 1817 9 659.6 735.7 857.6 972.5 
15 1524 1643 1825 1991 10 786.5 870.0 1003 1127 
16 1680 1804 1995 2168 11 918.6 1009 1153 1286 
17 1840 1969 2167 2346 12 1055 1153 1306 1447 
18 2003 2137 2342 2526 13 1197 1301 1463 1612 
19 2169 2308 2520 2709 14 1342 1452 1623 1779 
20 2339 2482 2700 2894 15 1491 1607 1786 1949 
21 2511 2659 2882 3081 16 1645 1765 1952 2121 
22 2686 2838 3067 3270 17 1800 1926 2120 2295 
23 2865 3020 3254 3461 18 1959 2090 2291 2472 
24 3046 3204 3444 3654 19 2122 2257 2465 2651 
25 3229 3391 3635 3848 20 2286 2427 2641 2831 
26 3416 3581 3828 4044 21 2456 2599 2819 3014 
27 3606 3773 4024 4242 22 2626 2775 2999 3199 
28 3798 3968 4222 4442 23 2799 2953 3182 3385 
29 3993 4166 4422 4644 24 2977 3132 3367 3573 
30 4191 4366 4624 4848 25 3156 3314 3554 3763 
31 4393 4569 4829 5054 26 3338 3499 3742 3955 
32 4597 4775 5037 5262 27 3522 3687 3933 4149 
33 4805 4983 5247 5473 28 3709 3877 4126 4344 
34 5015 5195 5459 5685 29 3899 4069 4321 4541 
35 5229 5410 5674 5899 30 4092 4263 4518 4739 
36 5448 5628 5892 6117 31 4288 4461 4718 4939 
37 5670 5850 6113 6337 32 4486 4661 4920 5142 
38 5896 6076 6338 6559 33 4688 4864 5124 5347 
39 6127 6306 6566 6785 34 4893 5070 5331 5554 
40 6363 6541 6798 7014 35 5100 5278 5540 5763 
41 6605 6781 7034 7247 36 5311 5489 5752 5974 
42 6853 7027 7276 7484 37 5526 5705 5967 6188 
43 7108 7279 7523 7726 38 5745 5923 6183 6404 
44 7372 7539 777 7974 39 5967 6145 6404 6622 
45 7647 7810 8040 8229 40 6195 6373 6629 6844 
46 7935 8092 8313 8493 41 6428 6603 6856 7069 
47 8242 8393 8602 8770 42 6665 6839 7089 7298 
48 8580 8721 8915 9068 43 6909 7081 7326 7531 
49 8975 9103 9275 9407 44 7160 7329 7569 7769 
45 7420 7585 7819 8012 
1 1.002 2.010 5.062 10.25 46 7690 7850 8077 8262 
2 20.89 29.97 48.82 71.54 47 7973 8128 8345 8522 
3 68.73 88.61 125.5 165.5 48 8275 8423 8629 8794 
4 137.7 168.4 222.3 277.9 49 8606 8745 8935 9086 
5 222.2 263.1 332.7 402.4 50 8995 9120 9289 9418 

















BIBLIOGRAPHY OF NONPARAMETRIC STATISTICS 
AND RELATED TOPICS 


I. RicHarp SAvaGE 
National Bureau of Standards 


This bibliography contains 999 references on nonparametric 
statistics and related topics, classified as follows: (A) Surveys 
and Discussions (39), (B) Theory (31), (C) Tchebycheff In- 
equalities (94), (D) Tolerance Sets (21), (E) Goodness of 
Fit (122), (F) Multisample Problems (53), (G) Parameter 
Problems (135), (H) Contingency Tables (75), (I) Randomness 
(109), (J) Correlation and Curve Fitting (96), (K) Compara- 
tive Studies (49), (L) Systematic Statistics (127), (M) Scaling 
(37), (N) Distribution Theory (383), (O) Applications (89), 
(P) Tables (228), (X) Miscellaneous (28). 


INTRODUCTION 


NONPARAMETRIC statistics has recently become an important special
field of statistics. Papers related to nonparametric problems were
published in the nineteenth century, but the true beginning of the sub- 
ject may be taken as 1936, the year in which Hotelling and Pabst pub- 
lished their paper on rank correlation. By 1943 the literature had be- 
come extensive enough to warrant a review article by Scheffé. Now the 
number of papers concerned with nonparametric statistics appearing 
in statistical journals is very large; in fact these articles are taking a 
considerable portion of the available space. Consequently supplements 
to this bibliography will be issued in order to keep it up to date. 

In spite of the abundance of nonparametric literature, there is no 
generally accepted definition of the field. It is not always clear which 
techniques, problems, and theories are nonparametric. In the prepara- 
tion of this bibliography over-inclusion has been deemed better than 
omission of titles that might be of use to those interested in border-line 
aspects of nonparametric statistics. 

Entries in the bibliography are arranged alphabetically by author, 
and chronologically within authors. After each entry one or several let- 
ters appear, indicating the categories in the following list to which the 
entry belongs: 


A. Surveys and Discussions
B. Theory
C. Tchebycheff Inequalities
D. Tolerance Sets
E. Goodness of Fit
F. Multisample Problems
G. Parameter Problems
H. Contingency Tables
I. Randomness
J. Correlation and Curve Fitting
K. Comparative Studies
L. Systematic Statistics
M. Scaling
N. Distribution Theory
O. Applications
P. Tables
X. Miscellaneous


Following many of the entries appears a sequence of digits of the form
xy-abc; this means that the entry was reviewed in Mathematical Re-
views in the year 19xy on page abc.
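
As an illustration of this review code, the following Python sketch splits a code such as [47-396] into a review year and a page number. The function name parse_review_code is hypothetical, and the sketch assumes, as stated above, that every code refers to a year of the form 19xy.

```python
import re


def parse_review_code(code):
    """Split a review code such as '[47-396]' into (year, page).

    Assumes the two leading digits xy denote the year 19xy.
    """
    match = re.fullmatch(r"\[?(\d{2})-(\d+)\]?", code.strip())
    if match is None:
        raise ValueError(f"not a review code: {code!r}")
    year = 1900 + int(match.group(1))
    page = int(match.group(2))
    return year, page


print(parse_review_code("[47-396]"))
```

A code that does not match the pattern raises an error rather than being silently repaired, since a damaged code cannot be interpreted reliably.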


A. Surveys and Discussions 


Since nonparametric statistics has existed as a special field only about 
fifteen years, surveys and discussions of the general subject are scarce. 
Two important papers of this type are by Scheffé [1943b] and Wolfo- 
witz [1949]. These papers give a comprehensive view of the problems 
and results obtained up to their time of publication. Pitman [1948] and 
Hemelrijk et al. [1951] have sets of lecture notes devoted to nonpara- 
metric statistics. Wilks [1948] covered many of the problems of non- 
parametric statistics in his discussion of order statistics. Wallis [1952] 
gave a brief introduction to the subject and its applications. Most of 
the other papers given this classification are specialized and their cross 
classifications will better indicate their content. 


B. Theory 


There does not exist a unified theory of nonparametric statistics, 
but there have appeared theoretical approaches to some of the special- 
ized problems. The structure of critical regions with optimum proper- 
ties was discussed by Feller [1938], Scheffé [1943a], Lehmann and Stein 
[1949], Hoeffding [1951b], and Lehmann [1951]. The use of “maximum 
likelihood” in the nonparametric theory was introduced by Wolfowitz 
[1942]; Levene [1952] made further use of this concept. Hodges and 
Lehmann [1950] gave some nonparametric estimators which are opti- 
mum in terms of minimax theory. 







C. Tchebycheff Inequalities 


Tchebycheff inequalities are included in the bibliography since they 
allow one to make probability statements when only a small amount 
of a priori information (usually several moments) is given about the 
distributions involved. Fréchet [1950] presented a review which included 
many of the inequalities of the Tchebycheff type. Godwin [1944] gave 
an English summary of the Fréchet material. Guttman [1948b] and 
Midzuno [1950] introduced inequalities using higher sample moments, 
which gave shorter average confidence intervals than the usual in- 
equalities. 
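
For orientation, the classical inequality of this type states that a variable with finite variance deviates from its mean by at least k standard deviations with probability at most 1/k². The Python sketch below compares that bound with the observed tail fraction in a small made-up sample. The helper names and the data are assumptions for illustration, and the sketch covers only the basic inequality, not the sharper results of the papers cited.

```python
import statistics


def chebyshev_bound(k):
    """Upper bound on P(|X - mean| >= k * sd) for any finite-variance X."""
    return min(1.0, 1.0 / (k * k))


def tail_fraction(data, k):
    """Fraction of observations at least k population sd's from the mean."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)  # population form, so the bound applies exactly
    return sum(abs(x - mean) >= k * sd for x in data) / len(data)


sample = [1, 2, 3, 4, 100]  # made-up data with one extreme value
for k in (1.5, 2.0, 3.0):
    print(k, tail_fraction(sample, k), chebyshev_bound(k))
```

Because the bound uses only the mean and variance, it holds for the sample treated as a finite population, provided the observations are not all equal.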


D. Tolerance Sets 


Wilks [1941] gave the first presentation of nonparametric tolerance 
limits. There have been subsequent generalizations of the theory of 
tolerance limits treating multivariate samples, irregularly shaped re- 
gions, discontinuous cases, and sequential cases. A recent paper in this 
field is by Fraser [1953a]. 


E. Goodness of Fit 


A goodness of fit test has the following properties: (1) It is defined 
for samples from some large class of distributions, such as all continuous 
univariate distributions. (2) The null hypothesis is either some specified 
distribution or a class of distributions of which the functional form is 
known. (3) For all null hypotheses the test statistic used has the same 
distribution (at least in the limit). (4) The test is consistent. 

The first goodness of fit test, chi-square, was introduced by K. Pear-
son [1900]. Since then many new procedures have been presented. Cur-
rently, there is much interest in the Kolmogorov-Smirnov tests and
related topics. These have been summarized by Anderson and Darling 
[1952]. Some progress has been made in devising goodness of fit tests 
for multivariate problems; see papers by P. B. Simpson [1951] and 
Rosenblatt [1952a]. There has been little justification for the proposed 
goodness of fit procedures, but Neyman [1937], Mann and Wald [1942], 
and Wolfowitz [1942] gave procedures having optimum properties 
other than consistency. 
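
To make property (3) concrete, the Python sketch below computes the Kolmogorov statistic, the largest absolute difference between the sample distribution function and a fully specified continuous null distribution. The function name ks_statistic and the uniform example are assumptions for illustration. Critical values are not computed here and would have to be taken from published tables.

```python
def ks_statistic(sample, cdf):
    """Largest absolute gap between the empirical cdf and `cdf`.

    `cdf` is assumed to be a fully specified continuous distribution function.
    """
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical cdf jumps from i/n to (i + 1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def uniform_cdf(x):
    """Distribution function of the uniform distribution on (0, 1)."""
    return min(max(x, 0.0), 1.0)


print(ks_statistic([0.1, 0.4, 0.7], uniform_cdf))
```

Since the statistic depends on the data only through the values cdf(x), its null distribution is the same for every continuous null hypothesis, which is the distribution-free property described above.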


F. Multisample Problems 


Multisample problems or multisample goodness of fit problems in- 
volve testing the hypothesis that several samples come from the same 
population. Solutions to these problems should satisfy the following
conditions: (1) Under the null hypothesis the test statistic is distribu-
tion-free (at least in the limit). (2) The procedure is consistent for a
large class of alternatives. 

An early approach to this problem was K. Pearson’s [1911] two sam- 
ple chi-square test. Since then, many procedures have been introduced. 
Recently, there have been many investigations of tests related to the 
Kolmogorov-Smirnov test; an example is the work of Anderson and 
Darling [1952]. Tests having optimum properties other than consist- 
ency were suggested by Wolfowitz [1942] and Lehmann [1951]. 


G. Parameter Problems 


Parameter problems include estimation and testing procedures deal- 
ing with location and scale parameters. Although these problems in- 
volve parameters, they are nonparametric since (1) the parameters are 
defined for large classes of distributions, and (2) the proposed proce- 
dures lead to probability statements that are distribution-free. 

A typical parameter problem is the testing of the hypothesis that a 
sample comes from a distribution with some specified percentage point. 
This problem has received extensive treatment by K. R. Nair [1940b], 
Steward [1941], Dixon and Mood [1946], Noether [1948, 1951] and 
Walsh [1949c, 1951a]. Testing the hypothesis that two samples come 
from populations with the same median has been examined by many 
authors beginning with the work of Wilcoxon [1945] and summarized 
by Kruskal and Wallis [1952]. Nonparametric analysis of variance has 
been discussed by Pitman [1937c], Friedman [1937], G. W. Brown and 
Mood [1951], and Terry and Bradley [1952c]. 

Optimum procedures have been investigated by Wolfowitz [1942], 
Lehmann and Stein [1949], Hodges and Lehmann [1950], and Hoeffding 
[1951b]. 
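
The percentage-point problem described above can be illustrated with the sign test. If the specified value is the 100p percent point of a continuous distribution, the number of observations below it follows a binomial distribution with parameters n and p. The Python sketch below is a minimal version under that assumption. The function name is hypothetical, and the two-sided rule used (summing all outcomes no more probable than the one observed) is one common convention rather than the procedure of any particular paper cited.

```python
from math import comb


def sign_test_p_value(sample, value, p=0.5):
    """Two-sided sign test that `value` is the 100p percent point."""
    below = sum(x < value for x in sample)
    n = sum(x != value for x in sample)  # observations equal to value are dropped
    probs = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = probs[below]
    # Sum the probabilities of all outcomes no more likely than the observed one.
    total = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, total)


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # made-up observations
print(sign_test_p_value(data, 0))    # every observation lies above 0
print(sign_test_p_value(data, 5.5))  # five below, five above
```

The p-value depends on the data only through counts, so it does not depend on the form of the underlying continuous distribution.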


H. Contingency Tables 


Contingency tables are the conventional r × s tables used to cross-
classify data. Techniques using contingency tables include the analysis
of association and tests of goodness of fit. Many of these techniques are 
distribution-free, since they are based on conditional distributions of 
the sample. 

An interesting theoretical treatment of contingency tables was 
made by Fisher [1948]. Papers by E. S. Pearson [1947] and Barnard 
[1947a, 1947b] constitute a survey of the field. Mainland [1948] gave 
extensive applications and tables. 
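
As a small worked illustration, the Python sketch below computes the familiar chi-square statistic for an r × s table of counts, using expected counts formed from the row and column totals. The function names and the example table are assumptions for illustration. The statistic is referred to the chi-square distribution only as a large-sample approximation, and this sketch is not one of the exact conditional procedures discussed in the papers cited.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x s table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count when rows and columns are independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def degrees_of_freedom(table):
    """(r - 1)(s - 1) for an r x s table."""
    return (len(table) - 1) * (len(table[0]) - 1)


counts = [[10, 20], [20, 10]]  # made-up 2 x 2 table
print(chi_square_statistic(counts), degrees_of_freedom(counts))
```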











I. Randomness 


In many situations it is desirable to examine the assumption of 
randomness. This involves testing the hypothesis that a sequence of 
observations was made on independently and identically distributed 
random variables. Wald and Wolfowitz [1943] and Levene [1952] pre- 
sented many procedures of this type. 
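
One of the simplest procedures of this kind is based on runs. The Python sketch below counts the runs in a sequence of two kinds of symbols and gives the expected number of runs under randomness, 1 + 2·n1·n2/(n1 + n2), where n1 and n2 are the numbers of symbols of each kind. The function names and the example sequence are assumptions for illustration. A complete test would also need the distribution of the number of runs, for which tables exist.

```python
from itertools import groupby


def count_runs(symbols):
    """Number of maximal blocks of identical adjacent symbols."""
    return sum(1 for _ in groupby(symbols))


def expected_runs(n1, n2):
    """Expected number of runs when all arrangements are equally likely."""
    return 1 + 2 * n1 * n2 / (n1 + n2)


sequence = "aabbbab"  # made-up sequence with 3 a's and 4 b's
print(count_runs(sequence), expected_runs(3, 4))
```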


J. Correlation and Curve Fitting 


Problems of correlation and curve fitting are properly classified as 
parameter problems (G) but the wealth of literature on this subject 
justifies a separate class. A fundamental paper on nonparametric 
correlation is the treatment of Spearman’s coefficient by Hotelling 
and Pabst [1936]. A small treatise on many rank correlation methods 
was prepared by Kendall [1948a]. Typical papers on curve fitting are 
by K. R. Nair and Shrivastava [1942] and K. R. Nair and Banerjee
[1942]. 
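
For readers unfamiliar with Spearman's coefficient, the Python sketch below computes it from the rank differences d as 1 − 6Σd²/(n(n² − 1)). The function names and the data are assumptions for illustration, and the formula in this form assumes that there are no ties within either set of observations.

```python
def ranks(values):
    """Ranks 1..n of the values, assuming there are no ties."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rank correlation coefficient for untied data."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


heights = [1.2, 3.4, 2.2, 5.0]  # made-up paired observations
weights = [10, 30, 20, 40]
print(spearman_rho(heights, weights))
```

Only the ranks enter the computation, so any increasing transformation of either variable leaves the coefficient unchanged.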


K. Comparative Studies 


The work of F. N. David and Johnson [1951a, 1951b] on the distri-
bution of the F statistic under non-normal conditions is typical of
the material given this classification. Emphasis is placed on finding the 
operating characteristics of “normal” statistics under non-normal con- 
ditions rather than on the development of distribution-free statistics. 
These papers are nonparametric, since they show which of the para- 
metric procedures have operating characteristics which are not strongly 
dependent on the specific parametric assumptions that were used in 
their development. 


L. Systematic Statistics 


Mosteller [1946] introduced the term “systematic statistics” when 
referring to linear functions of the order statistics of a sample. Much 
of the theory of these statistics has involved the assumption of nor- 
mality. Nevertheless, these techniques have two things in common with 
nonparametric techniques: ease of computation, and “inefficiency.” 
Dixon and Massey [1951] summarized many of the uses of systematic 
statistics. 


M. Scaling 


Many statistical problems involve the measurement or the com- 
parison of objects where the units of measurement are not well defined, 
for instance in measuring tastes. Hence either artificial scales are de-
veloped or scales are avoided by using ranks. The reports by Bradley
and Duncan [1950], and Bradley and Terry [1951a, 1951b] contain
much information on this subject.


N. Distribution Theory 


The development of nonparametric procedures involves many dis- 
tribution problems. Mood [1940] found many of the distributions that 
are connected with run theory. Wald and Wolfowitz [1944] developed 
limit theorems needed in the theory of statistics based on the method 
of randomization. Hoeffding and Robbins [1948] gave special central 
limit theorems useful in developing tests of randomness. 


O. Applications


Since nonparametric statistics has developed recently there have 
been few published applications. However, most theoretical papers 
give illustrations of the methods being presented. 


P. Tables 


Once a nonparametric technique has been developed it is often useful 
to have special tables to facilitate applications. An example is the 
tables by Swed and Eisenhart [1943], giving the distribution of runs. 


X. Miscellaneous 


In spite of the many classifications given, a few papers remain un- 
classified. Papers by Wallis [1942] on the combination of independent 
tests and Tukey [1946] on inequalities for deviations from the median 
are typical of these miscellaneous papers. 


ACKNOWLEDGMENT 


The author wishes to thank Mrs. Yvette B. Cocozzella for her per- 
sistent and conscientious efforts in handling the tedious clerical prepa- 
ration of the bibliographic material. 





MAIN BIBLIOGRAPHY 


A 


Adler, Franz (1951): “Yates’ correction and the statisticians,” Journal of the 
American Statistical Association, 46, 490-501. H 
Allinson, V. A., and Bates, C. D. (1944): “The basis of the C.S.M. test-charts and
methods of calculation,” (British) Ministry of Supply Advisory Service on 
Statistical Method and Quality Control. Technical Report No. Q.C./R/17 
(10 March 1944). H 
Andersen, Erik Sparre (1949): “On the number of positive sums of random vari-
ables,” Skandinavisk Aktuarietidskrift, 32, 27-36. N [50-256]
Anderson, Oskar N. (1935): Einführung in die mathematische Statistik, Julius
Springer, Wien, pp. 95-105, 187-93, 243-47. CEP
Anderson, T. W. (1943): “On card matching,” Annals of Mathematical Statis- 
tics, 14, 426-35. N [44-208] 
Anderson, T. W., and Darling, D. A. (1952): “Asymptotic theory of certain 
‘goodness of fit’ criteria based on stochastic processes,” Annals of Mathe-
matical Statistics, 23, 193-212. ENP [53-298]
André, Désiré (1879): “Développements de séc z et de tang z,” Comptes Rendus 
(Paris), 88, 965-67. N 
——— (1883a): “Probabilité pour qu’une permutation donnée de n lettres soit
une permutation alternée,” Comptes Rendus (Paris), 97, 983-84. N
——— (1883b): “Sur le nombre des permutations de n éléments qui présentent s
séquences,” Comptes Rendus (Paris), 97, 1356-58. N
Anscombe, F. J. (1948): “The validity of comparative experiments,” Journal 
of the Royal Statistical Society, 111, 181-211. A [49-724] 
Arfwedson, G. (1951): “A probability distribution connected with Stirling’s 
second class numbers,” Skandinavisk Aktuarietidskrift, 34, 121-32. 
N [62-9656] 
Armitage, P. (1944): “A go not-go sequential t-test,” (British) Ministry of 
Supply Advisory Service on Statistical Method and Quality Control. Tech- 
nical Report Q.C./R/20 (22 November 1944). G 
Armitage, P., Baines, A. H. J., and Lindley, D. V. (1944): “On some properties 
of binomial sequences,” (British) Ministry of Supply Advisory Service on 
Statistical Method and Quality Control. Technical Report No. Q.C./R/28 
(1 November 1944). IN
Azorin, F., and Wold, H. (1950): “Product sums and modulus sum of H. Wold’s
normal deviates,” Trabajos Estadistica, 1, 5-28; (English and Spanish).
EOP [52-478]
B 
Bachelier, Louis (1912): Calcul des probabilités, Vol. I, Gauthier-Villars, Paris; 
pp. 252-53. N 
Baillie, Donald C. (1946): “On testing the significance of mortality ratios by the
use of χ²,” Transactions of the Actuarial Society of America, 47, 326-34.
O [47-284]
Baker, G. A. (1946): “Distribution of the ratio of sample range to sample 
standard deviation for normal and combinations of normal distributions,” 
Annals of Mathematical Statistics, 17, 366-69. KLP [47-43] 




Baker, G. A., and Guilbert, H. R. (1942): “Non-randomness of variations in
daily weights of cattle,” Journal of Animal Science, 1, 293-99. O 
Banerjee, D. P. (1952): “On the distribution of the range of variation of the 
ordered variates in samples of n from normal universe,” Proceedings of the 
Indian Academy of Science, Sect. A, 35, 24-26. LN [52-762]
Barnard, G. A. (1943a): “The use of the median in place of the mean in quality 
control charts,” (British) Ministry of Supply Advisory Service on Quality 
Control. Technical Report No. Q.C./R/2 (19 April 1943). GLO 
——— (1943b): “Statistical methods applied to assembly processes,” (British)
Ministry of Supply Advisory Service on Statistical Method and Quality Con-
trol. Technical Report No. Q.C./R/3—Part I (1 May 1943). CO
——— (1944): “An analogue of Tchebycheff’s inequality in terms of range,” 
(British) Ministry of Supply Advisory Service on Statistical Method and 
Quality Control. Technical Report No. Q.C./R/11 (6 February 1944). C 
——— (1945): “A new test for 2×2 tables,” Nature, 156, 177. H [46-131]
——— (1947a): “Significance tests for 2×2 tables,” Biometrika, 34, 123-38.
H [47-396]
——— (1947b): “2×2 tables. A note on E. S. Pearson’s paper,” Biometrika, 34,
168-69. H [47-396]
Bartlett, M. S. (1935): “The effect of non-normality on the t-distribution,” Pro-
ceedings of the Cambridge Philosophical Society, 31, 223-31. K
——— (1949): “Fitting a straight line when both variables are subject to error,”
Biometrics, 5, 207-12. J [50-190]
——— (1951): “The frequency goodness of fit test for probability chains,” Pro-
ceedings of the Cambridge Philosophical Society, 47, 86-95. I [51-512]
——— (1952): “A sampling test of the χ² theory for probability chains,” Bio-
metrika, 39, 118-21. I [52-962]
Bateman, G. (1948): “On the power function of the longest run as a test for 
randomness in a sequence of alternatives,” Biometrika, 35, 97-112. 
INP [48-603] 
Baten, W. D. (1946): “Analysis of scores from sampling tests,” Biometrics, 2, 
11-14. M
Baten, W. D., and Trout, G. M. (1946): “A critical study of the summation-of- 
difference-in-rank method of determining proficiency in judging dairy 
products,” Biometrics, 2, 67-69. M 
Bates, Grace E., and Neyman, Jerzy (1951): Discriminatory analysis: VIII. Con-
tribution to the theory of accident proneness, Part I. An optimistic model of
the correlation between light and severe accidents, USAF School of Aviation
Medicine, Randolph Field, Texas, Project No. 21-49-004. O
Baticle, Edgar (1951): “Sur la probabilité des itérations dans le scheme de 
Bernoulli,” Comptes Rendus (Paris), 472-73. N [51-619]
Battin, I. L. (1942): “On the problem of multiple matching,” Annals of Mathe- 
matical Statistics, 13, 294-305. N [48-102] 
Béjar, Juan (1952): “Maxima and minima of the coefficients of asymmetry and 
kurtosis in finite populations,” Trabajos Estadistica, 3, 3-11; (Spanish, 
French summary). X [53-389] 
Belz, M. H., and Hooke, Robert (1953): “Approximate distribution of extreme 
values of the range,” Annals of Mathematical Statistics, 24, 143-44; abstract. 
LN 










Benderskii, A. M. (1952): “On the distribution of the absolute value of the maxi-
mum deviation from the mean in a series of observations,” Doklady Aka-
demii Nauk SSSR (N.S.), 85, 5-8; (Russian). LN [52-68]
Bendersky, L. (1948): “Sur quelques problèmes concernant les épreuves répé-
tées,” Bulletin des Sciences Mathématiques, (2) 72, 99-107. N [49-384]
Bennett, B. M. (1952): “The power function of the Haldane-Smith test,” Annals 
of Mathematical Statistics, 23, 476; abstract. I 
Bennett, Carl A. (1951): “Application of tests for randomness,” Industrial and
Engineering Chemistry, 43, 2063-67. IO
——— (1952): Asymptotic properties of ideal linear estimators, Dissertation,
University of Michigan. LP
——— (1953): “Asymptotic properties of ideal linear estimators,” Annals of
Mathematical Statistics, 24, 138-39; abstract. L


Benson, F. (1949): “A note on the estimation of mean and standard deviation 
from quantiles,” Journal of the Royal Statistical Society (B), 11, 91-100. KL 
Berge, P. O. (1932): “Über das Theorem von Tchebycheff und andere Grenzen
einer Wahrscheinlichkeitsfunktion,” Skandinavisk Aktuarietidskrift, 14, 65-
77. CP
——— (1937): “A note on a form of Tchebycheff’s theorem for two variables,”
Biometrika, 29, 405-06. CP
Berger, Agnes (1951): “On uniformly consistent tests,” Annals of Mathematical 
Statistics, 22, 289-93. B [52-143]
Bergström, Harald (1949): “On the central limit theorem in the case of not
equally distributed random variables,” Skandinavisk Aktuarietidskrift, 32,
37-62. N [50-266]
——— (1951): “On asymptotic expansions of probability functions,” Skan-
dinavisk Aktuarietidskrift, 34, 1-34. N [52-258]


Berkson, Joseph (1938): “Some difficulties of interpretation encountered in the 
application of the chi-square test,” Journal of the American Statistical 
Association, 33, 526-36. E 

Berkson, Joseph, and Geary, R. C. (1941): “Comments on Dr. Madow’s ‘Note on
tests of departure from normality’ with some remarks concerning tests of 
significance,” Journal of the American Statistical Association, 36, 539-43. E 

Bernstein, S. (1924): “Sur une modification de l’inégalité de Tchebichef,” Ann.
Sc. Instit. Sav. Ukraine, Sect. Math. 1; (Russian, French Summary). C
——— (1927): Theory of Probability, Moscow, pp. 159-65. C
——— (1937): “Sur quelques modifications de l’inégalité de Tchebycheff,”
Comptes Rendus (Doklady) de l’Académie des Sciences de l’URSS, 17, 279-82.
C

Berry, Andrew C. (1941): “The accuracy of the Gaussian approximation to the
sum of independent variates,” Transactions of the American Mathematical
Society, 49, 122-36. N [41-228]
Bertrand, J. (1875): “Note relative au théorème de M. Bienaymé,” Comptes
Rendus (Paris), 81, 458 and 459-92. IN
——— (1907): Calcul des probabilités, Paris. N


Besson, Louis (translated and abridged by E. W. Woolard) (1920): “On the com- 
parison of meteorological data with chance results,” Monthly Weather Review 
(U.S. Department of Agriculture), 48, 89-94. IN 

Bhate, D. H. (1951): “A note on the estimates of centre of location of symmetrical
populations,” Calcutta Statistical Association Bulletin, 4, 33-35. L [52-762]










Bhattacharyya, A. (1943): “On a measure of divergence between two statistical
populations defined by their probability distributions,” Bulletin of the Cal-
cutta Mathematical Society, 35, 99-109. F [45-7]
——— (1946): “On a measure of divergence between two multinomial popula-
tions,” Sankhyā, 7, 401-6. F [47-282]
Bickerstaff, T. A. (1947): Certain order probabilities in non-parametric sampling, 
Dissertation, University of Michigan. NP 
Bienaymé, J. (1853): “Considérations à l’appui de la découverte de Laplace
sur la loi des probabilités dans la méthode des moindres carrés,” Comptes
Rendus (Paris), 37, 309-26. C
——— (1867): “Considérations à l’appui de la découverte de Laplace sur la loi
des probabilités dans la méthode des moindres carrés,” Journal de Mathé-
matiques pures et appliquées, (2) 12, 158. C
——— (1875): “Application d’un théorème nouveau du calcul des probabilités,”
Comptes Rendus (Paris), 81, 417-23. IN


Bignardi, F. (1947a): “Sur les généralizations de l’inégalité de Bienaymé-
Tchebycheff dans l’étude des distributions de fréquence,” Ann. Facoltà di
econ. e commercio, Univ. di Palermo, 1, 47-79. C
——— (1947b): “De quelques critères pour l’application des inégalités de
Bienaymé-Tchebycheff et de Vinci dans l’étude des distributions de fré-
quence,” Ann. Facoltà di econ. e commercio, Univ. di Palermo, 1, 84-108. C
Bilham, E. G. (1926): “Correlation coefficients,” Quarterly Journal of the Royal
Meteorological Society, 52, 172. INO
Birnbaum, Z. W. (1948): “On random variables with comparable peakedness,”
Annals of Mathematical Statistics, 19, 76-81. CNX [48-452]
——— (1950): “On the distribution of Kolmogorov’s statistic for finite sample
size,” Proceedings of the seminar on scientific computation (IBM Corp.,
N. Y.), 33-36, November 1949. EN [52-571]
——— (1952): “Numerical tabulation of the distribution of Kolmogorov’s
statistic for finite sample size,” Journal of the American Statistical Associa-
tion, 47, 425-41. EP [53-889]
——— (1953): “Distribution-free tests of fit for continuous distribution func-
tions,” Annals of Mathematical Statistics, 24, 1-7. ABE


Birnbaum, Z. W., Raymond, J., and Zuckerman, H. S. (1947): “A generalization 
of Tshebyshev’s inequality to two dimensions,” Annals of Mathematical 
Statistics, 18, 70-79. C [47-470] 

Birnbaum, Z. W., and Tingey, Fred H. (1951): “One-sided confidence contours for 
probability distribution functions,” Annals of Mathematical Statistics, 22, 
592-96. ENP [52-367]

Birnbaum, Z. W., and Zuckerman, Herbert S. (1944): “An inequality due to H. 
Hornich,” Annals of Mathematical Statistics, 15, 328-29. X [45-160] 

——— (1949): “A graphical determination of sample sizes for Wilks’ tolerance
limits,” Annals of Mathematical Statistics, 20, 313-16. D [49-724]
Bliss, C. I., Anderson, E. D., and Marland, R. E. (1943): A technique for testing
consumer preferences with special reference to the constituents of ice cream,
Storrs Agricultural Experiment Station Bulletin, 251, 20 pp. M
Blomqvist, Nils (1950): “On a measure of dependence between two random
variables,” Annals of Mathematical Statistics, 21, 593-600. JNP [51-510]
——— (1951): “Some tests based on dichotomization,” Annals of Mathematical
Statistics, 22, 362-71. GNP [52-143]



















Bonnier, Gert (1942): “The four-fold table and the heterogeneity test,” Science
(N.S.), 96, 13-14. H [43-26] 
Borel, Emile (1933): “Sur un problème élémentaire de probabilités et la quasi
périodicité de certains phénomènes arithmétiques,” Comptes Rendus (Paris),
196, 881-82. IN
Borsting, Jack (1952): “On the addition of chi-squares. (Preliminary Report),” 
Annals of Mathematical Statistics, 23, 480; abstract. E 
Bose, R. C. (1946): “The patch number problem,” Science and Culture, 12, 199- 
200. IN [47-389] 
——— (1950): “On a problem of two dimensional probability,” Sankhyā, 10,
13-28. IN [51-113]


Bottema, O., and van Veen, S. C. (1943): “Calculation of probabilities in the
game of billiards,” Nieuw Archief voor Wiskunde (2), 22, 15-33; (Dutch).
N [46-209]
——— (1946): “Calculation of probabilities in the game of billiards. II,” Nieuw
Archief voor Wiskunde (2), 22, 123-158; (Dutch). N [47-470]


Bowker, Albert H. (1944): “Note on consistency of a proposed test for the prob-
lem of two samples,” Annals of Mathematical Statistics, 15, 98-101. F [45-10]
Bradley, R. A. (1952a): “The distribution of the t and F statistics for a class of
non-normal populations,” Virginia Journal of Science (N.S.), 3, 1-34. K [52-666]
——— (1952b): “Corrections for nonnormality in the use of the two-sample t-
and F-tests at high significance levels,” Annals of Mathematical Statistics,
23, 103-13. GK [52-666]
——— (1953): “Some statistical methods in taste testing and quality evalua-
tion,” Biometrics, 9, 22-38. AM


Bradley, R. A., and Duncan, D. B. (1950): Statistical methods for sensory difference 
tests of food quality. Bi-Annual Report No. 1, Mimeographed report from
the Virginia Agricultural Experiment Station, Blacksburg, Va., December 
1950. AM 

Bradley, R. A., and Terry, M. E. (1951a): Statistical methods for sensory difference 
tests of food quality. Bi-Annual Report No. 2, Mimeographed report from 
Virginia Agricultural Experiment Station, Blacksburg, Va., June 1951. AM 

——— (1951b): Statistical methods for sensory difference tests of food quality. Bi-
Annual Report No. 3, Mimeographed report from Virginia Agricultural
Experiment Station, Blacksburg, Va., December 1951. M
——— (1952a): “Rank analysis of incomplete block designs. I. The method of
paired comparisons,” Annals of Mathematical Statistics, 23, 299-300; ab-
stract. G
——— (1952b): “Rank analysis of incomplete block designs. II. The method for
blocks of three. (Preliminary Report),” Annals of Mathematical Statistics,
23, 300; abstract. G
——— (1952c): Statistical methods for sensory difference tests of food quality. Bi-
Annual Report No. 4, Mimeographed report from Virginia Agricultural Ex-
periment Station, Blacksburg, Va., June 1952. G

Brown, Archibald, and Stanley, P. J. (n.d.): “Relation between chemical analysis 
and ballistic limit for armour plate,” (British) Ministry of Supply Advisory 
Service on Statistical Method and Quality Control. Technical Report No. 
Q.C./E/3. J 






















Brown, Bernice (1948): Some tests of the randomness of a million digits, The
RAND Corporation, California, RAOP-44, (19 October). IO
Brown, G. W., and Mood, A. M. (1951): “On median tests for linear hypotheses,”
Proceedings of the Second Berkeley Symposium on Mathematical Statistics
and Probability, University of California Press, 159-66. GN
Brown, W. R. J. (1952): “Statistics of color-matching data,” Journal of the
Optical Society of America, 42, 252-56. O
Brownlee, John (1924): “Some experiments to test the theory of goodness of fit,” 
Journal of the Royal Statistical Society, 87, 76-82. EO 
Burr, Irving W. (1952): “Distribution of ranges from an arbitrary discrete popu- 
lation,” Annals of Mathematical Statistics, 23, 145; abstract. LN 

C
Cadwell, J. H. (1952): “The distribution of quantiles of small samples,” Bio- 
metrika, 39, 207-11. LNP [62-961] 


Cameron, J. M. (1952): “Results of some tests of randomness on pseudo-random 
numbers. (Preliminary Report),” Annals of Mathematical Statistics, 23, 138; 
abstract. IO

Camp, Burton H. (1922): “A new generalization of Tchebycheff’s statistical in- 
equality,” Bulletin of the American Mathematical Society, 28, 427-32. CP 

——— (1923): “Note on Professor Narumi’s paper,” Biometrika, 15, 421-23. C
——— (1938): “Further interpretations of the chi-square test,” Journal of the
American Statistical Association, 33, 537-42. E
——— (1942): “Some recent advances in mathematical statistics, I,” Annals of
Mathematical Statistics, 13, 62-73; see part IV. A [43-24]
——— (1946): “The effect on a distribution function of small changes in the
population function,” Annals of Mathematical Statistics, 17, 226-31. BK [47-44]
——— (1948): “Generalization to N dimensions of inequalities of the Tcheby-
cheff type,” Annals of Mathematical Statistics, 19, 568-74. CP [49-384]

Campbell, W. E. (1942): Use of statistical control in corrosion and contact re-
sistance studies, Bell Telephone System Technical Publications Monograph
B-1350. IO
Cantelli, F. P. (1910): “Intorno ad un teorema fundamentale della teoria del
rischio,” Bolletino dell’Associazione degli Attuari Italiani (Milan). C
——— (1911): “Intorno ad un teorema di calcolo della probabilità,” Giornale di
matematiche di Battaglini (Napoli) (3), 49, 341-52. C
——— (1928): “Sui confini della probabilità,” Atti del Congresso Internazionale
dei Matematici Bologna 3-10 Settembre 1928 (VI), 6, 47-59. AC
Carlton, A. George (1946): “Estimating the parameters of a rectangular distribu- 
tion,” Annals of Mathematical Statistics, 17, 355-58. N [47-41] 


Carroll, John B., and Bennett, C. C. (1950): “Machine short-cuts in the com- 
putation of chi-square and the contingency coefficient,” Psychometrika, 15, 
441-47. EH 


Chakrabarti, M. C. (1946): “A note on skewness and kurtosis,” Bulletin of the
Calcutta Mathematical Society, 38, 133-36. X [47-393]
——— (1947): “On the inadequacy of measuring the peakedness of a distribution
curve by the standardised fourth moment,” Bulletin of the Calcutta Mathe-
matical Society, 39, 154-56. X [49-50]













Chandler, K. N. (1952): “The distribution and frequency of record values,”
Journal of the Royal Statistical Society (B), 14, 220-28. LNP
Chandra Sekar, C., and Francis, Mary G. (1941): “A method to get the sig-
nificance limit of a type of test criteria,” Sankhyā, 5, 165-68. LN [43-165]
Chapelon, Jacques M. (1937): “Sur l’inégalité fondamentale du calcul des
probabilités,” Bulletin de la Société Mathématique de France (Paris), 65, 100-
48. C
Chaplin, W. S. (1880): “The relation between the tensile strengths of long and
short bars,” Van Nostrand’s Engineering Magazine, 23, 441-44. LO
——— (1882): “On the relative tensile strengths of long and short bars,” Pro-
ceedings of the Engineers’ Club, Philadelphia, 3, 15-28. LO
Chapman, Dwight (1934): “The statistics of the method of correct matchings,”
American Journal of Physiology, 46, 287-98. MN
——— (1935): “The generalized problem of correct matchings,” Annals of
Mathematical Statistics, 6, 85-95. MNP
Charley, Helen (1950): “Effect of baking pan material on heat penetration dur-
ing baking and on quality of cakes made with fat,” Food Research, 15, 155-
67. O
——— (1952): “Effects of internal temperature and of oven temperature on the
cooking losses and the palatability of baked salmon steaks,” Food Re-
search, 17, 136-43. O
Chown, L. N., and Moran, P. A. P. (1951): “Rapid methods for estimating cor- 
relation coefficients,” Biometrika, 38, 464-67. J [52-667] 
Chung, Kai Lai (1946): “The approximate distribution of Student’s statistic,” 
Annals of Mathematical Statistics, 17, 447-65. K [47-283] 
(1948): “On the maximum partial sums of sequences of independent ran- 
dom variables,” Transactions of the American Mathematical Society, 64, 
205-33; see 215-16. N [49-132]
(1949): “An estimate concerning the Kolmogoroff limit distribution,” 
Transactions of the American Mathematical Society, 67, 36-50. EN [50-606] 
Chung, Kai Lai, and Feller, W. (1949): “On fluctuations in coin tossing,” Pro- 
ceedings of the National Academy of Science (USA), 35, 605-08. N [50-444] 
Clark, A. L. (1933): “An experimental investigation of probability,” Canadian
Journal of Research, 9, 402-14. NO
——— (1934): “Experimental probability,” Canadian Journal of Research, 11,
658-64. O


Clopper, C. J., and Pearson, Egon S. (1934): “The use of confidence or fiducial 
limits illustrated in the case of the binomial,” Biometrika, 26, 404-13. P 
Cochran, W. G. (1936): “Statistical analysis of field counts of diseased plants,”
Journal of the Royal Statistical Society (B), 3, 49-67. IKO
——— (1937a): “The χ² distribution for the binomial and Poisson series, with
small expectations,” Annals of Eugenics, 7, 207-14. E
——— (1937b): “The efficiencies of the binomial series test of significance of a
mean and of a correlation coefficient,” Journal of the Royal Statistical
Society, 100, 69-73. GJ
——— (1938): “An extension of Gold’s method of examining the apparent per-
sistence of one type of weather,” Quarterly Journal of the Royal Meteorologi-
cal Society, 64, 631-34. INO
——— (1941): “The distribution of the largest of a set of estimated variances as a
fraction of their total,” Annals of Eugenics, 11, 47-52. L [42-171]













——— (1942): “The χ² correction for continuity,” Iowa State College Journal of
Science, 16, 421-36. EN [43-280]
——— (1943): “Analysis of variance for percentages based on unequal num-
bers,” Journal of the American Statistical Association, 38, 287-301. G [45-92]
——— (1950): “The comparison of percentages in matched samples,” Bio-
metrika, 37, 256-66. GN [51-621]
——— (1952): “The χ² test of goodness of fit,” Annals of Mathematical Sta-
tistics, 23, 315-45. AEH [52-190]

Cole, LaMont C. (1945): “A simple test of the hypothesis that alternative events
are equally probable,” Ecology, 26, 203-05. GOP
Cole, Randall H. (1949): An R-ply range estimation of mean and standard devia-
tion, Mimeographed Report No. 20, Statistical Research Group, Princeton
University. LP
——— (1951): “Relations between moments of order statistics,” Annals of
Mathematical Statistics, 22, 308-10. L [51-841]


Cowles, Alfred, and Jones, Herbert E. (1937): “Some a posteriori probabilities in 
stock market action,” Econometrica, 5, 280-94. IO 

Cox, D. R. (1948): “A note on the asymptotic distribution of range,” Biometrika,
35, 310-15. LP [49-466]
Craig, Allen T. (1932): “On the distribution of certain statistics,” American
Journal of Mathematics, 54, 353-66. L
Craig, C. C. (1933): “On the Tchebycheff inequality of Bernstein,” Annals of
Mathematical Statistics, 4, 94-102. C
——— (1942): “Some recent advances in mathematical statistics, II,” Annals of
Mathematical Statistics, 13, 74-85. A [43-26]
——— (1953): “Combination of neighboring cells in contingency tables,” Journal
of the American Statistical Association, 48, 104-12. H
Cramér, Harald (1924): “Remarks on correlation,” Skandinavisk Aktuarietid-
skrift, 7, 220-40. J
——— (1928a): “On the composition of elementary errors. First paper: Mathe-
matical deductions,” Skandinavisk Aktuarietidskrift, 11, 13-74. N
——— (1928b): “On the composition of elementary errors. Second paper: Statisti-
cal applications,” Skandinavisk Aktuarietidskrift, 11, 141-80. ENO
——— (1946): Mathematical methods of statistics, Princeton University Press,
Princeton, N. J., pp. 182-83, 256, 367-78. CLMN [47-49]
Crist, J. W. (1940): “Correlation from ranks for horticultural research,” Proceed-
ings of American Society of Horticultural Science, 38, 593-95. JO
Crist, J. W., and Seaton, H. L. (1941): “Reliability of organoleptic tests,” Food
Research, 6, 529-36. MO
Cronbach, Leo J., and Gleser, Goldine C. (1952): Similarity between persons
and related problems of profile analysis, Bureau of Research and Service,
College of Education, University of Illinois, Urbana, Illinois, Technical
Report No. 2. J
Crow, Edwin L. (1952): “Some cases in which Yates’ correction should not be
applied,” Journal of the American Statistical Association, 47, 303-04. H

Curtiss, J. H. (1950): “Lot quality measured by average or variability,” Accep-
tance Sampling, The American Statistical Association, Washington, D. C.,
79-116. C




























D 


Daly, Joseph F. (1946): “On the use of the sample range in an analogue of Stu- 
dent’s t-test,” Annals of Mathematical Statistics, 17, 71-74. LP [46-464] 
Daniels, H. E. (1943): “The relation between measures of correlation in the 
universe of sample permutations,” Biometrika, 33, 129-35. J [45-91]
——— (1950): “Rank correlation and population models,” Journal of the Royal
Statistical Society (B), 12, 171-81. J [51-725]
——— (1951): “Note on Durbin and Stuart’s formula for E(r_s),” Journal of the
Royal Statistical Society (B), 13, 310. JN
Daniels, H. E., and Kendall, M. G. (1947): “The significance of rank correlations 
where parental correlation exists,” Biometrika, 34, 197-208. JN [48-364] 
van Dantzig, D. (1951a): “On the consistency and the power of Wilcoxon’s two
sample test,” Proceedings Koninklijke Nederlandse Akademie van Weten-
schappen (A), 54 = Indagationes Mathematicae, 13, 1-8. F [51-726]
——— (1951b): “Une nouvelle généralisation de l’inégalité de Bienaymé,” An-
nales de l’Institut Henri Poincaré, 12, 31-43. C [52-51]
——— (1952): “Wiskundige consultatie ten behoeve van medisch, biologisch en
ander onderzoek (Mathematical consultation for medical, biological and
other research),” Verslag Koninklijke Nederlandse Akademie van Weten-
schappen, 61, 13-18. O

Dantzig, George B. (1939): “On a class of distributions that approach the normal
distribution function,” Annals of Mathematical Statistics, 10, 247-53.
IN [40-21]
Darling, D. A. (1951): “Sums of symmetrical random variables,” Proceedings of
the American Mathematical Society, 2, 511-17. N [52-258]
——— (1952a): “The influence of the maximum term in the addition of inde-
pendent random variables,” Transactions of the American Mathematical
Society, 73, 95-107. LN [53-60]
——— (1952b): “On a test for homogeneity and extreme values,” Annals of
Mathematical Statistics, 23, 450-56. LN [53-298]


David, F. N. (1934): “On the P_λn test for randomness: remarks, further illustra-
tion, and table of P_λn for given values of −log10 λn,” Biometrika, 26, 1-11.
ENP
——— (1938): “Limiting distributions connected with certain methods of
sampling human populations,” Statistical Research Memoirs, 2, 69. N
——— (1939): “On Neyman’s ‘smooth’ test for goodness of fit I. Distribution of
the criterion ψ² when the hypothesis is true,” Biometrika, 31, 191-99.
ENP [61-53]
——— (1947a): “A χ² ‘smooth’ test for goodness of fit,” Biometrika, 34, 299-310.
ENP [48-600]
——— (1947b): “A power function for tests of randomness in a sequence of al-
ternatives,” Biometrika, 34, 335-39. INP [48-600]
——— (1948): “Correlations between χ² cells,” Biometrika, 35, 418-22.
EN [49-465]
——— (1949): “Note on the application of Fisher’s k-statistics,” Biometrika, 36,
383-93. KP [50-447]
——— (1950a): “Two combinatorial tests of whether a sample has come from
a given population,” Biometrika, 37, 97-110. EP [51-38]
——— (1950b): “An alternative form of χ²,” Biometrika, 37, 448-51. E [51-345]













——— (1950c): Review of “Rank correlation methods” by M. G. Kendall,
Biometrika, 37, 190. J
David, F. N., and Johnson, N. L. (1948): “The probability integral transforma-
tion when parameters are estimated from the sample,” Biometrika, 35, 182-
90. EN [49-61]
——— (1951a): “The effect of non-normality on the power function of the F-test
in the analysis of variance,” Biometrika, 38, 43-57. GKN [52-53]
——— (1951b): “A method of investigating the effect of nonnormality and
heterogeneity of variance on tests of the general linear hypothesis,” Annals
of Mathematical Statistics, 22, 382-92. GKN [52-143]
——— (1951c): “The sensitivity of analysis of variance tests with respect to
random variation between groups,” Trabajos Estadistica, 2, 179-88; (Eng-
lish, Spanish summary). GK [52-572]
——— (1952): “Extension of a method of investigating the properties of analysis
of variance tests to the case of random and mixed models,” Annals of
Mathematical Statistics, 23, 594-601. GKN [53-488]
David, H. A. (1951): “Further applications of range to the analysis of variance,”
Biometrika, 38, 393-409. GLP [52-668]


David, S. T., Kendall, M. G., and Stuart, A. (1951): “Some questions of dis-
tribution in the theory of rank correlation,” Biometrika, 38, 131-40.
JNP [52-85]
Davies, O. L., and Pearson, E. S. (1934): “Methods of estimating from samples
the population standard deviations,” Journal of the Royal Statistical Society
B, 1, 76-93. KL
Davis, D. J. (1952): “An analysis of some failure data,” Journal of the American 
Statistical Association, 47, 113-50. O 


Davis, H. T. (1941): The statistics of time series, Northwestern University
Studies in Mathematics and the Physical Sciences, no. 1: Mathematical 
Monographs, 1, 45-85. I [42-175] 

Dawson, E. H., Duehring, M., and Parks, V. E. (1947): “Addition of ground egg 
shell dried egg for use in cooking,” Food Research, 12, 288-97. O 
Deming, W. Edwards (1938): “Some thoughts on curve fitting and the chi test,” 
Journal of the American Statistical Association, 33, 543-51. J 
Dixon, W. J. (1940): “A criterion for testing the hypothesis that two samples
are from the same population,” Annals of Mathematical Statistics, 11, 199-
204. FNP [41-111]
——— (1952): “Power efficiency function for normal alternatives for several non-
parametric tests,” Annals of Mathematical Statistics, 23, 475; abstract. G
Dixon, W. J., and Massey, F. J. (1951): An introduction to statistical analysis,
McGraw-Hill Book Co., N.Y.; see chapters 13, 15, and 17. A
Dixon, W. J., and Mood, A. M. (1946): “The statistical sign test,” Journal of the
American Statistical Association, 41, 557-66. GP
Dodd, Edward L. (1923): “The greatest and the least variate under general laws
of error,” Transactions of the American Mathematical Society, 25, 525-39. LNP
——— (1939): “The length of the cycles which result from the graduation of
chance elements,” Annals of Mathematical Statistics, 10, 254-64. I [40-23]
——— (1941): “The problem of assigning a length to the cycle to be found in a
simple moving average and in a double moving average of chance data,”
Econometrica, 9, 25-37. I [41-237]













——— (1942): “Certain tests for randomness applied to data grouped into small
sets,” Econometrica, 10, 249-57. I [43-108]
Domb, C. (1947): “The problem of random intervals on a line,” Proceedings of
the Cambridge Philosophical Society, 43, 329-41. N [47-591]
——— (1952): “On the use of a random parameter in combinatorial problems,”
The Proceedings of the Physical Society Section A, 65, 305-09. N [52-186]
Donsker, Monroe D. (1952): “Justification and extension of Doob’s heuristic ap-
proach to the Kolmogorov-Smirnov theorems,” Annals of Mathematical
Statistics, 23, 277-81. EN [52-853]
Doob, J. L. (1949): “Heuristic approach to the Kolmogorov-Smirnov theorems,”
Annals of Mathematical Statistics, 20, 393-403. EFN [50-43]
——— (1951): “Continuous parameter martingales,” Proceedings of the Second
Berkeley Symposium on Mathematical Statistics and Probability, University
of California Press, 269-77; see section 3. C [52-476]
——— (1953): Stochastic Processes, John Wiley and Sons, New York, pp. 1-654. CN
Drion, E. F. (1951): “Estimation of the parameters of a straight line and of the
variances of the variables, if they are both subject to error,” Proceedings
Koninklijke Nederlandse Akademie van Wetenschappen (A), 54 = Indaga-
tiones Mathematicae, 13, 256-60. GJ [52-144]
——— (1952): “Some distribution-free tests for the difference between two em-
pirical cumulative distribution functions,” Annals of Mathematical Sta-
tistics, 23, 563-74. FNP [53-488]
DuBois, Philip (1935): “Note on the calculation of the chi-square test for ‘good-
ness of fit’,” Psychometrika, 4, 173-74. EL
——— (1939): “Formulas and tables for rank correlation,” Physiological
Record, 3, 45-56. JP
Duncan, A. J. (1952): Quality control and industrial statistics, Richard D. Irwin,
Inc., Chicago, Illinois. P
Durbin, J. (1951): “Incomplete blocks in ranking experiments,” The British
Journal of Psychology (Statistical Section), 4, 85-90. GM [52-963]
Durbin, J., and Stuart, A. (1951): “Inversions and rank correlation coefficients,”
Journal of the Royal Statistical Society (B), 13, 303-09. JN [52-963]
Dwass, Meyer (1952): Contributions to the theory of rank order tests, Dissertation, 
University of North Carolina. GL 
Dyson, F. J. (1943): “A note on kurtosis,” Journal of the Royal Statistical Society 
(N.S.), 106, 360-61. X [46-162] 


E 


Eddison, R. T., et al. (1951): “Keeping quality and raw-milk grading,” Journal 
of Dairy Research, 18, 43-71. GM 
Eden, J., and Yates, F. (1933): “On the validity of Fisher’s z test when applied 
to an actual example of non-normal data,” Journal of Agricultural Science, 
23, 6-13. K 
Edwards, A. L. (1950): “On ‘the use and misuse of the chi-square test’—the case 
of the 2×2 contingency table,” Psychological Bulletin, 47, 341-46. EH
Eells, Walter Crosby (1929): “Formulas for probable errors of coefficients of cor- 
relation,” Journal of the American Statistical Association, 24, 170-77. J 
Eggleton, Philip, and Kermack, William Ogilvie (1944): “A problem in the ran- 







Section A, 62, 103-15. N [45-88] 
Egudin, G. I. (1947): “Certain relations between the moments of the distribution 
of extreme values in random samples,” Doklady Akademiya Nauk SSSR 
(N.S.), 58, 1581-84. L [48-295] 
Ehrenberg, A. S. C. (1951): “Note on normal transformations of ranks,” British 
Journal of Psychology (Statistical Section), 4, 133-34. G 
——— (1952): “On sampling from a population of rankers,” Biometrika, 39,
82-87. MN [52-64]
Eisenhart, Churchill (1935): “A test for the significance of lithological varia-
tions,” Journal of Sedimentary Petrology, 5, 137-45. EHO
——— (1937): A contribution to the theory of testing statistical hypotheses, Dis-
sertation, University College, London. EN
——— (1938): “The power function of the chi-square test,” Bulletin of the
American Mathematical Society, 44, 32; abstract. EN
——— (1947): “The assumptions underlying the analysis of variance,” Bio-
metrics, 3, 1-21. GK [47-693]
Eisenhart, Churchill, Deming, Lola S., and Martin, Celia S. (1948): “The prob- 
ability points of the distribution of the median in random samples from any 
continuous population,” Annals of Mathematical Statistics, 19, 598-99; ab- 
stract. LN 
Eisenhart, Churchill, and Wilson, Perry W. (1943): “Statistical methods and 
control in bacteriology,” Bacteriological Review, 7, 57-137. EIO 
Elderton, W. Palin (1901): “Tables for testing the goodness of fit of theory to 
observation,” Biometrika, 1, 155-63. EP 
(1927): Frequency curves and correlation, Charles and Edwin Layton, 
London, 2nd edition, chapter 11. E 
Elfving, G., and Whitlock, J. H. (1950): “A simple trend test with application to 
erythrocyte size data,” Biometrics, 6, 282-88. IO 
Epstein, Benjamin (1948): “Application of the theory of extreme values in frac-
ture problems,” Journal of the American Statistical Association, 43, 403-12.
LO
——— (1949): “The distribution of extreme values in samples whose members
are subject to a Markoff chain condition,” Annals of Mathematical Sta-
tistics, 20, 590-94. LN [50-376]
——— (1951): “Correction to ‘The distribution of extreme values in samples
whose members are subject to a Markoff chain condition’,” Annals of
Mathematical Statistics, 22, 133-34. L
——— (1952): Estimates of mean life based on the r’th smallest value in a sample
of size n drawn from an exponential distribution, Mimeographed report from
the Department of Mathematics, Wayne University, Detroit, Michigan.
Prepared under contract Nonr-451(00)(NR-042-017) for Office of Naval
Research. LP
——— (1953): “A nonparametric two-sample life test,” Annals of Mathematical
Statistics, 24, 142-43; abstract. G
Epstein, Benjamin, and Sobel, M. (1952a): Some tests based on the first r ordered
observations drawn from an exponential distribution, Stanford University
Technical Report No. 6, Wayne University Technical Report No. 1, March
1, 1952. LP







——— (1952b): “Some tests based on the first r ordered observations drawn
from an exponential distribution,” Annals of Mathematical Statistics, 23,
143-44; abstract. L
Erdös, P., and Kac, M. (1947): “On the number of positive sums of independent
random variables,” Bulletin of the American Mathematical Society, 53,
1011-20. N [48-299]
Esscher, F. (1924): “On a method of determining correlation from the ranks of
the variates,” Skandinavisk Aktuarietidskrift, 7, 201-19. JNO
Esseen, Carl-Gustav (1942): “On the Liapounoff limit of error in the theory of
probability,” Arkiv för Matematik, Astronomi och Fysik, 28A, No. 9, 19 pp.
N [45-232]

Evans, W. Duane (1942): “The standard error of percentiles,” Journal of the 
American Statistical Association, 37, 367-76. GN [43-103] 
Eysenck, H. J. (1939): “The validity of judgments as a function of the number of 
judges,” Journal of Experimental Psychology, 25, 650-54. MOP 


F 


Faber, A. (1922): “Ueber nach Polynomen fortschreitende Reihen,” Sitzungs-
berichte der Mathematisch-Naturwissenschaftlichen Klasse der Bayerischen
Akademie der Wissenschaften zu München, 157-78. C

Federighi, Enrico (1950): “The use of chi-square in small samples,” American 
Sociological Review, 15, 777-79. HN 

Feller, W. (1938): “Note on regions similar to the sample space,” Statistical 
Research Memoirs, 2, 117-25. B 

(1945): “The fundamental limit theorems in probability,” Bulletin of 
the American Mathematical Society, 51, 800-32, Section 5. N [49-310] 
——— (1948): “On the Kolmogorov-Smirnov limit theorems for empirical dis-
tributions,” Annals of Mathematical Statistics, 19, 177-89. EFN [48-599]
——— (1950): An introduction to probability theory and its applications. Vol. 1,
John Wiley and Sons, Inc., New York; chapter 13. N [51-424]
——— (1951): “The asymptotic distribution of the range of sums of independent
random variables,” Annals of Mathematical Statistics, 22, 427-32.
LN [52-140]

Feuell, A. J., and Rybicka, S. M. (1951): “Quality control chart based on good- 
ness-of-fit test,” Nature, 167, 194-95. E 

Festinger, Leon (1943): “A statistical test for means of samples from skew 
populations,” Psychometrika, 8, 205-10. GK [44-128] 

(1946): “The significance of difference between means without reference 
to the frequency distribution function,” Psychometrika, 11, 97-105. 
GP [47-43] 

Fiedler, Fred E., Hartman, Walter, and Rudin, Stanley A. (1952): The relation- 
ship of interpersonal perception to effectiveness in basketball teams, Technical 
Report No. 3, Contract N6ori-07135 with the Office of Naval Research, 
University of Illinois, Urbana, Illinois. O

Finch, D. J. (1950): “The effect of non-normality on the z-test, when used to
compare the variances in two populations,” Biometrika, 37, 186-89.
KP [51-38]

Finney, D. J. (1942): Review of “Significance test for time series” by W. A.
Wallace and G. Moore, Annals of Eugenics, 11, 308. I





——— (1947): “The significance of associations in a square point lattice,”
Journal of the Royal Statistical Society (B), 9, 99-103. IN [48-291]
——— (1948): “The Fisher-Yates test of significance in 2×2 contingency
tables,” Biometrika, 35, 145-56. HP [49-62]
Fisher, R. A. (1922): “On the interpretation of chi-square from contingency 
tables, and the calculation of p,” Journal of the Royal Statistical Society, 85, 
87-94. HN 
——— (1923): “Statistical tests of agreement between observation and
hypothesis,” Economica, 3, 139-47. EN
——— (1924): “The conditions under which χ² measures the discrepancy between
observation and hypothesis,” Journal of the Royal Statistical Society,
87, 442-50. E
——— (1925): “Tests of significance in harmonic analysis,” Proceedings of the
Royal Society of London (A), 125, 54-59.
——— (1926a): “Bayes’ theorem and the fourfold table,” The Eugenics Review,
18, 32-33. HN
——— (1926b): “On the random sequence,” Quarterly Journal of the Royal
Meteorological Society, 52, 250. IN
——— (1928): “On a property connecting the χ² measure of discrepancy with
the method of maximum likelihood,” Atti del Congresso Internazionale dei
Matematici Bologna, Settembre 1928 (VI) 6, 3-10. EN
——— (1935): “The logic of inductive inference,” Journal of the Royal Statistical
Society, 98, 39-82; example 1. H
——— (1941): “The interpretation of experimental four-fold tables,” Science
(N.S.), 94, 210-11. H [43-26]
——— (1945): “A new test for 2×2 tables,” Nature, 156, 388. H
——— (1948): Statistical methods for research workers, Hafner Publishing Co.,
Inc., N. Y., 10th edition, section 24, example 19. G
——— (1949): The design of experiments, Hafner Publishing Co., Inc., N. Y., 5th
edition; section 21. G
——— (1950): “The significance of deviations from expectation in a Poisson
series,” Biometrics, 6, 17-24. EP
Fisher, R. A., and Tippett, L. H. C. (1928): “Limiting forms of the frequency 
distribution of the largest or smallest member of a sample,” Proceedings 
of the Cambridge Philosophical Society, 24, 180-90. LNP 
Fisher, R. A., and Yates, Frank (1948): Statistical tables for biological, agricultural
and medical research. 3rd ed., Oliver and Boyd, London. LP [49-740]
Fiske, Donald W., and Dunlap, Jack W. (1945): “A graphical test for the sig- 
nificance of differences between frequencies from different samples,” Psy- 
chometrika, 10, 225-29. E 
Fix, Evelyn (1949): “Tables of noncentral χ²,” University of California Publications
in Statistics, 1, 15-19. P [51-344]
Fix, Evelyn, and Hodges, J. L., Jr. (1951): Discriminatory analysis; nonparametric
discrimination: consistency properties, USAF School of Aviation
Medicine, Randolph Field, Texas, Project Number 21-49-004, Report
Number 4. G
——— (1952): Discriminatory analysis; nonparametric discrimination: small
sample performance, USAF School of Aviation Medicine, Randolph Field,
Texas, Project Number 21-49-004, Report Number 11. GP











Fraser, D. A. S. (1950): “Note on the χ² smooth test,” Biometrika, 37, 447-48.
EN [51-345]
——— (1951): “Sequentially determined statistically equivalent blocks,” Annals
of Mathematical Statistics, 22, 372-81. D [52-260]
——— (1952): “Nonparametric theory: Confidence regions and tests for location
and scale parameters. (Preliminary Report),” Annals of Mathematical
Statistics, 23, 477; abstract.

——— (1953a): “Nonparametric tolerance regions,” Annals of Mathematical
Statistics, 24, 44-55. D
——— (1953b): “Completeness of the order statistics in the nonparametric
case,” Annals of Mathematical Statistics, 24, 139; abstract. B
Fraser, D. A. S., and Wormleighton, R. (1951): “Nonparametric estimation
IV,” Annals of Mathematical Statistics, 22, 294-98. D [51-841]
Fréchet, M. (1927): “Sur la loi de probabilité de l’écart maximum,” Annales de 
la Société Polonaise de Mathématiques, 6, 93-116. LN 
——— (1931): “Le generalizzazione delle ineguaglianza di Bienaymé,” Giornale
dell’Instituto Italiano degli Attuari, Anno II, 22-36. C
——— (1947): “The general relation between the mean and the mode for a
discontinuous variate,” Annals of Mathematical Statistics, 18, 290-93.
X [48-45]
——— (1950): Recherches théoriques modernes sur la théorie des probabilités,
Gauthier-Villars, Paris (Deuxième édition), 130-68. C


Freeman, G. H., and Halton, J. H. (1951): “Note on an exact treatment of
contingency, goodness of fit and other problems of significance,” Biometrika,
38, 141-49. GH [52-145]
Freund, John (1951): “The transfer distribution,” Mathematics Magazine, 25,
63-66. IN


Friedman, Milton (1937): “The use of ranks to avoid the assumption of normality 
implicit in the analysis of variance,” Journal of the American Statistical 
Association, 32, 675-701. GNP 

——— (1940): “A comparison of alternative tests of significance for the problem
of m rankings,” Annals of Mathematical Statistics, 11, 86-92. MNP [40-348]

Frisch, Ragnar (1926): “Impossibilité de Resserrer l’inégalité de Markov dans le
cas général,” Comptes Rendus du Sixième Congrès des Mathématiciens
Scandinaves tenu à Copenhague du 31 août au 4 septembre 1925, 203-06. C

Fry, Thornton C. (1938): “The χ²-test of significance,” Journal of the American
Statistical Association, 33, 513-25. E

Fulcher, John S. (1942): “The item analyzer: a mechanical device for treating
the four fold table in large samples,” Journal of Applied Psychology, 26, 511-22.
HP





G 


Gage, Robert (1943): “Contents of Tippett’s ‘Random Sampling Numbers’,” 
Journal of the American Statistical Association, 38, 223-27. JO [43-223] 
Gardner, A. (1952): “Greenwood’s ‘problem of intervals’: an exact solution for
N = 3,” Journal of the Royal Statistical Society (B), 14, 135. NP [53-186]
Gardner, Robert S. (1950): “A non-parametric test for the hypothesis that two
bivariate samples come from the same population,” Appendix from Technical
Memo. No. 4542-33 Naval Ordnance Test Station, Inyokern, California.
FN [51-193]











Gartštein, B. N. (1948): “On certain limit laws for the range,” Doklady Akademii
Nauk SSSR (N.S.), 60, 1119-21; (Russian). LN [49-61]
Gauss, C. F. (1821): “Theoria combinationis observationum,” Werke—Volume 
4; Kaestner, Gottingen, 1880; pp. 10-11; (Latin). C 


Gayen, A. K. (1949): “The distribution of ‘Student’s’ t in random samples of
any size drawn from non-normal universes,” Biometrika, 36, 353-69.
KP [50-447]
——— (1950a): “The distribution of the variance ratio in random samples of
any size drawn from non-normal universes,” Biometrika, 37, 236-55.
GP [51-344]
——— (1950b): “Significance of difference between the means of two non-normal
samples,” Biometrika, 37, 399-408. GKP [51-345]
——— (1951): “The frequency distribution of the product-moment correlation
coefficient in random samples of any size drawn from non-normal universes,”
Biometrika, 38, 219-47. JNP [52-86]
Geary, R. C. (1936): “The distribution of ‘Student’s’ ratio for non-normal 
samples,” Journal of the Royal Statistical Society (B), 3, 178-84. KN 





——— (1947): “Testing for normality,” Biometrika, 34, 209-42. EKP [48-364]
——— (1953): “Non-linear functional relationship between two variables when
one variable is controlled,” Journal of the American Statistical Association,
48, 94-103. J
Gebelein, Hans (1942): “Verfahren zur Beurteilung einer sehr geringen Korrelation
zwischen zwei statistischen Merkmalsreihen,” Zeitschrift für angewandte
Mathematik und Mechanik, 22, 286-98. H [45-6]
Geppert, Maria-Pia (1944): “Über den Vergleich zweier beobachteter Häufigkeiten,”
Deutsche Mathematik, 7, 553-92. H [47-398]
Ghosh, M. N. (1948): “A test for field uniformity based on the space correlation
method,” Sankhyā, 9, 39-46. IN [49-389]
Gihman, I. I. (1952): “On the empirical distribution function in the case of
grouping of the data,” Doklady Akademii Nauk SSSR (N.S.), 82, 837-40;
(Russian). F [52-666]





Gildemeister, M., and van der Waerden, B. L. (1943-44): “Die Zulässigkeit des
χ²-Kriteriums für kleine Versuchszahlen,” Berichte über die Verhandlungen
der Sächsischen Akademie der Wissenschaften zu Leipzig. Mathematisch-Naturwissenschaftliche
Klasse, 95, 145-50. HN [47-394]
Gleissberg, W. (1945a): “Eine Aufgabe der Kombinatorik und Wahrscheinlichkeitsrechnung,”
Revue de la Faculté des Sciences de l’Université d’Istanbul
(A), 10, 25-35; (German, Turkish summary). INP [46-406]
——— (1945b): “Ein Kriterium für die Realität zyklischer Variationen,” Revue
de la Faculté des Sciences de l’Université d’Istanbul (A), 10, 36-42; (German,
Turkish summary). I [47-40]
——— (1947): “Bedingungen für die Anordnung zufälliger Fehler,” Revue de
la Faculté des Sciences de l’Université d’Istanbul (A), 12, 107-126; (German,
Turkish and English summaries). N [47-690]
Glivenko, V. (1933): “Sulla determinazione empirica delle leggi di probabilita,”
Giornale dell’Instituto Italiano degli Attuari, 4, 92. E
Gnedenko, B. V. (1943): “Sur la distribution limite du terme maximum d’une
série aléatoire,” Annals of Mathematics (2), 44, 423-53. LN [44-41]
——— (1952): “Some results on the maximum discrepancy between two empirical
distributions,” Doklady Akademii Nauk SSSR (N.S.), 82, 661-63;
(Russian). FN [52-760]
Gnedenko, B. V., and Korolyuk, V. S. (1951): “On the maximum discrepancy
between two empirical distributions,” Doklady Akademii Nauk SSSR
(N.S.), 80, 525-28; (Russian). FN [52-570]
Gnedenko, B. V., and Mihalevič, V. S. (1952a): “On the distribution of the
number of excesses of one empirical distribution function over another,”
Doklady Akademii Nauk SSSR (N.S.), 82, 841-43; (Russian).
FN [52-760]
——— (1952b): “Two theorems on the behavior of empirical distribution
functions,” Doklady Akademii Nauk SSSR (N.S.), 85, 25-27; (Russian).
FN [62-60]
Gnedenko, B. V., and Rvačeva, E. L. (1952): “On a problem of comparison of
two empirical distributions,” Doklady Akademii Nauk SSSR (N.S.), 82,
513-16; (Russian). FN [52-760]
Godwin, H. J. (1944): “Inequalities related to Tchebycheff’s inequality,” 
(British) Ministry of Supply Advisory Service on Statistical Method and 
Quality Control. Technical Report No. Q.C./R/13 (14 February 1944). AC 
——— (1948): “A further note on the mean deviation,” Biometrika, 35, 304-09.
LN [49-387]
——— (1949a): “On the estimation of dispersion by linear systematic statistics,”
Biometrika, 36, 92-100. K [50-673]
——— (1949b): “Some low moments of order statistics,” Annals of Mathematical
Statistics, 20, 279-85. LP [49-722]
Gold, E. (1929): “Note on the frequency of occurrences of events in series of two
types,” Quarterly Journal of the Royal Meteorological Society, 55, 307. INO
Gontcharoff, W. (1942): “Sur la distribution des cycles dans les permutations,” 
Comptes Rendus (Doklady) de l’ Académie des Sciences URSS (N.S.), 35, 
267-69. N [43-102] 
——— (1943): “Sur la succession des événements dans une série d’épreuves
indépendantes répondant au schéme de Bernoulli,” Comptes Rendus (Doklady)
de l’Académie des Sciences URSS (N.S.), 38, 283-85. N [44-124]
——— (1944): “Du domaine de l’analyse combinatoire,” Izvestia Akademii
Nauk SSSR, 8, 3-48; (Russian, French summary). N [45-88]


Good, I. J. (1953): “The serial test for sampling numbers and other tests for
randomness,” Proceedings of the Cambridge Philosophical Society, 49, 276-84.
IN
Goodman, Leo A. (1952): “Serial number analysis,” Journal of the American 
Statistical Association, 47, 623-34. IN 





——— (1953): “Parameter-free and nonparametric tolerance limits: The exponential
case,” Annals of Mathematical Statistics, 24, 139-40; abstract. D
Goodman, Leo A., and Kruskal, W. H. (1953): “Measures of association for
cross-classifications,” Annals of Mathematical Statistics, 24, 147; abstract.
H

Gordon, Mordecai H., et al. (1952): “An extended table of chi-square for two 
degrees of freedom for use in combining probabilities from independent 
samples,” Psychometrika, 17, 311-16. PX 
Goudsmit, S. (1945): “Random distribution of lines in a plane,” Reviews of 
Modern Physics, 17, 321-22. N [46-309] 










Grant, Alison M. (1952): “Some properties of runs in smoothed random series,”
Biometrika, 39, 198-204. INP [52-963]
Greenwood, J. A. (1938): “Variance of a general matching problem,” Annals of 
Mathematical Statistics, 9, 56-59. N 
——— (1940): “The first four moments of a general matching problem,” Annals
of Eugenics, 10, 290-92. N [41-228]
——— (1943): “A preferential matching problem,” Psychometrika, 8, 185-91.
M [44-127]


Greenwood, Major (1946): “The statistical study of infectious diseases,”
Journal of the Royal Statistical Society, 109, 85-103; discussion, 103-10.
IN [47-691]
Greenwood, M. L., and Salerno, R. (1949): “Palatability of kale in relation to
cooking procedure and variety,” Food Research, 14, 314-19. MO
Greenwood, Robert E. (1953): “Probabilities of certain solitaire card games,”
Journal of the American Statistical Association, 48, 88-93. NP
Greville, T. N. E. (1938): “Exact probabilities for the matching hypothesis,”
Journal of Parapsychology, 2, 55-59. N
——— (1941): “The frequency distribution of a general matching problem,”
Annals of Mathematical Statistics, 12, 350-54. MN [42-171]
——— (1944): “On multiple matching with one variable deck,” Annals of
Mathematical Statistics, 15, 432-34. N [45-232]








Griffith, A. A. (1920): “The phenomena of rupture and flow in solids,” Philosophical
Transactions of the Royal Society, 221A, 163. LO
Grossnickle, Louise T. (1942): “The scaling of test scores by the method of 
paired comparisons,” Psychometrika, 7, 43-64. M 
Grüneberg, Hans, and Haldane, J. B. S. (1937): “Tests of goodness of fit applied
to records of Mendelian segregation in mice,” Biometrika, 29, 144-53. EO
Guilford, J. P. (1941): “The phi coefficient and chi square as indices of item 
validity,” Psychometrika, 6, 11-19. JP 
Guldberg, Alf (1922): “Sur un théorème de M. Markoff,” Comptes Rendus
(Paris), 175, 679-80. C
Gumbel, E. J. (1935b): “Les valeurs extrêmes des distributions statistiques,”
Annales de l’Institut H. Poincaré, 5, 115-58. LNP
——— (1942): “Simple tests for given hypotheses,” Biometrika, 32, 317-33.
E [43-26]
——— (1943): “On the reliability of the classical chi-square test,” Annals of
Mathematical Statistics, 14, 253-63. E [45-9]
——— (1944): “Ranges and midranges,” Annals of Mathematical Statistics, 15,
414-22. LN [45-162]
——— (1946): “On the independence of the extremes in a sample,” Annals of
Mathematical Statistics, 17, 78-81. LN [46-464]
——— (1947): “The distribution of the range,” Annals of Mathematical Statistics,
18, 384-412. LNP [48-196]
——— (1949): “Probability tables for the range,” Biometrika, 36, 142-48.
LNP [50-527]
Gumbel, E. J., and Greenwood, J. A. (1951): “Table of the asymptotic distribution
of the second extreme,” Annals of Mathematical Statistics, 22, 121-24.
LNP [51-621]





























Gumbel, E. J., and Keeney, R. D. (1950a): “The geometric range for distributions
of Cauchy’s type,” Annals of Mathematical Statistics, 21, 133-37.
LNP [50-446]
——— (1950b): “The extremal quotient,” Annals of Mathematical Statistics,
21, 523-38. LNP [51-428]


Gumbel, E. J., and von Schelling, H. (1950): “The distribution of the number of
exceedances,” Annals of Mathematical Statistics, 21, 247-62. DNP [50-732]
Gupta, Hansraj (1950): “Tables of distributions,” Research Bulletin of the East
Panjab University, 1950, 13-44. NP [51-760]
Guttman, Louis (1946): “An approach for quantifying paired comparisons and 
rank order,” Annals of Mathematical Statistics, 17, 144-63. MO [47-41] 
——— (1948a): “An inequality for kurtosis,” Annals of Mathematical Statistics,
19, 277-78. CX [48-599]
——— (1948b): “A distribution-free confidence interval for the mean,” Annals
of Mathematical Statistics, 19, 410-13. CG [49-136]
H 


Haden, H. G. (1947): “A note on the distribution of the different orderings of n
objects,” Proceedings of the Cambridge Philosophical Society, 43, 1-9.
N [47-160]
Hadwiger, H. (1943): “Über gleichwahrscheinliche Aufteilungen,” Zeitschrift
für angewandte Mathematik und Mechanik, 22, 226-32. N [44-124]
——— (1946): “Eine Bemerkung über zufällige Anordnungen der natürlichen
Zahlen,” Mitteilungen der Vereinigung Schweizerischer Versicherungsmathematiker,
46, 105-09. N [46-406]
Hald, A. (1952): Statistical theory with engineering applications, John Wiley and
Sons, Inc., New York; chapters 12 and 13. ILO [53-188]


Hald, A., and Sinkbaek, S. A. (1950): “A table of percentage points of the χ²
distribution,” Skandinavisk Aktuarietidskrift, 33, 168-75. EP [51-621]
Haldane, J. B. S. (1937): “The exact value of the moments of the distribution of
χ², used as a test of goodness of fit, when expectations are small,” Biometrika,
29, 133-43. EN
——— (1939): “The mean and variance of χ², when used as a test of homogeneity,
when expectations are small,” Biometrika, 31, 346-55. HN
——— (1940): “The mean and variance of χ², when used as a test of homogeneity,
when expectations are small,” Biometrika, 31, 346-55.
HN [40-346]
——— (1949): “Some statistical problems arising in genetics,” Journal of the
Royal Statistical Society (B), 11, 1-9; discussion, 9-14. A [51-431]
Haldane, J. B. S., and Smith, Cedric A. B. (1948): “A simple exact test for birth- 
order effect,” Annals of Eugenics, 14, 117-24. INOP 
Halmos, Paul R. (1946): “The theory of unbiased estimation,” Annals of
Mathematical Statistics, 17, 34-43. BG [46-463]
——— (1950): Measure theory, D. Van Nostrand Company, Inc., New York,
196-200. C [50-504]


Hannan, James F. (1950): “Some tests based on the empirical distribution func- 
tion. (Preliminary Report),” Annals of Mathematical Statistics, 21, 312-13; 
abstract. EN 

Harris, Lee B. (1952): “On a limiting case for the distribution of exceedances,
with an application to life-testing,” Annals of Mathematical Statistics, 23,
295-98. LNP [52-956]
Hartley, H. O. (1942): “The range in random samples,” Biometrika, 32, 334-48. 
LN [43-21] 

——— (1950a): “The use of range in analysis of variance,” Biometrika, 37, 271-80.
GLNP [51-62]
——— (1950b): “The maximum F-ratio as a short-cut test for heterogeneity of
variance,” Biometrika, 37, 308-12. GLNP [51-346]
Hartley, H. O., and Pearson, E. S. (1951): “Moment constants for the distribution
of range in normal samples,” Biometrika, 38, 463-64. LP [52-664]
Hastings, Jr., Cecil, et al. (1947): “Low moments for small samples: A comparative
study of order statistics,” Annals of Mathematical Statistics, 18, 413-26.
KLNP [48-196]
Hayashi, Chikio (1950): “Fragments of a new test formula of normality,”
Annals of the Institute of Statistical Mathematics, Tokyo, 1, 125-30.
GLN [51-36]
Hemelrijk, J. (1949a): “Construction of a confidence region for a line,” Proceedings
Koninklijke Nederlandse Akademie van Wetenschappen, 52, 995-1005
= Indagationes Mathematicae, 11, 374-84. J [50-529]
——— (1949b): Construction of a confidence region for and estimation of the
coefficients of a line from a number of points which have been observed with
error in one or both directions, Mimeographed report S 28 from the Mathematical
Centre, Amsterdam (Dutch). DJ
——— (1950a): “A family of parameterfree tests for symmetry with respect to a
given point. I,” Proceedings Koninklijke Nederlandse Akademie van Wetenschappen,
53, 945-55 = Indagationes Mathematicae, 12, 340-50.
EF [51-37]
——— (1950b): “A family of parameterfree tests for symmetry with respect to a
given point. II,” Proceedings Koninklijke Nederlandse Akademie van Wetenschappen,
53, 1186-98 = Indagationes Mathematicae, 12, 419-31.
EFG [51-429]
——— (1950c): Symmetrietoetsen en andere toepassingen van de theorie van
Neyman en Pearson. (Symmetry tests and other applications of the theory of
Neyman and Pearson), Excelsiors Foto-Offset, ’s-Gravenhage, 91+4 pp.
EFG [51-511]
——— (1950d): A symmetry test, Mathematisch Centrum Amsterdam, Rapport
ZW-1950-015, 9 pages. (Dutch.) EFG [51-622]
——— (1950e): “Rangcorrelatie en de schattingsproef van Varangot,” (“Rank
correlation methods applied to an experiment on estimation of Varangot”),
Statistica, Rijswijk, 4, 216-25; (English summary.) J [51-429]
——— (1952a): “Parametrische en parametervrije methoden en hun toepassingen,”
(“Parametric and parameterfree methods and their applications”),
Statistica, Rijswijk, 5, 171-84; (Dutch, English summary). A
——— (1952b): “A Theorem on the sign test when ties are present,” Proceedings
Koninklijke Nederlandse Akademie van Wetenschappen (A), 55 = Indagationes
Mathematicae, 14, 322-26. G [52-962]
——— (1952c): “Note on Wilcoxon’s two-sample test when ties are present,”
Annals of Mathematical Statistics, 23, 133-35. GN [52-762]


Hemelrijk, J., et al. (1951): Cursus “parametervrije methoden,” 
Mathematical Centre, Amsterdam. ADFGJP 











Herdan, G. (1949): “Use of statistical inequalities in polymer research,” Research,
2, 235. CO
——— (1950): “Use of statistical inequalities in polymer research,” Research, 3,
35. CO
——— (1953): Small particle statistics: An account of statistical methods for the
investigation of finely divided materials, Elsevier Publishing Co., New York,
1-520. CX
van der Heiden, J. A. (1952): “On a correction term in the method of paired
comparisons,” Biometrika, 39, 211-12. MN


Hirschmann, Albert O. (1943): “On measures of dispersion for a finite distribu- 
tion,” Journal of the American Statistical Association, 38, 346-52. 

K [44-42] 

Hodges, J. L., and Lehmann, E. L. (1950): “Some problems in minimax point 

estimation,” Annals of Mathematical Statistics, 21, 182-97. BG [51-36] 

Hoeffding, Wassily (1940): “Maszstabinvariante Korrelationstheorie,” Schriften
des Mathematischen Instituts und des Instituts für angewandte Mathematik
der Universität Berlin, 5, Part 3, 179-233. J [42-5]
——— (1941): “Maszstabinvariante Korrelationsmasse für diskontinuierliche
Verteilungen,” Archiv für mathematische Wirtschafts und Sozialforschung, 7,
49-70. J
——— (1942): “Stochastische Abhängigkeit und funktionaler Zusammenhang,”
Skandinavisk Aktuarietidskrift, 25, 200-27. J [46-212]





——— (1947): “On the distribution of the rank correlation coefficient τ when
the variates are not independent,” Biometrika, 34, 183-96. JN [48-364]
——— (1948a): “A class of statistics with asymptotically normal distribution,”
Annals of Mathematical Statistics, 19, 293-325. ABN [49-134]
——— (1948b): “A non-parametric test of independence,” Annals of Mathematical
Statistics, 19, 546-57. JNP [49-654]
——— (1951a): “A combinatorial central limit theorem,” Annals of Mathematical
Statistics, 22, 558-66. N [52-363]
——— (1951b): “‘Optimum’ nonparametric tests,” Proceedings of the Second
Berkeley Symposium on Mathematical Statistics and Probability, University
of California Press, 83-92. B
——— (1952a): “The large-sample power of tests based on permutations of
observations,” Annals of Mathematical Statistics, 23, 169-92. AB
——— (1952b): “Some powerful rank order tests,” Annals of Mathematical
Statistics, 23, 303; abstract. BJ
——— (1953): “On the distribution of the expected values of the order statistics,”
Annals of Mathematical Statistics, 24, 93-100. LN


Hoeffding, Wassily, and Robbins, Herbert (1948): “The central limit theorem
for dependent random variables,” Duke Mathematical Journal, 15, 773-80.
N [49-200]
Hoel, Paul G. (1938): “On the chi-square distribution for small samples,”
Annals of Mathematical Statistics, 9, 158-65. EN
Hojo, Tokishige (1932): “Distribution of the median, quartiles and interquartile
distance in samples from a normal population,” Biometrika, 23, 315-60. LN
——— (1933): “A further note on the relation between the median and the
quartiles in small samples from a normal population,” Biometrika, 25, 79-90.
LP













Homma, Tsuruchiyo (1951): “On the asymptotic independence of order statistics,”
Reports of Statistical Application Research Union of Japanese Scientists
and Engineers, 1, 1-8. L [52-64]
Horn, D. (1942): “A correction for the effect of tied ranks on the value of the
rank difference correlation coefficient,” Journal of Educational Psychology,
33, 686. J
Hornich, Hans (1941): “Zur Theorie des Risikos,” Monatshefte für Mathematik
und Physik, 50, 142-50. X [45-4]
Hotelling, Harold (1940): “The teaching of statistics,” Annals of Mathematical 
Statistics, 11, 457-70; see 460-61. J 
——— (1947): “Effects of non-normality at high significance levels,” Annals of
Mathematical Statistics, 18, 608-09; abstract. K
Hotelling, Harold, and Pabst, Margaret Richards (1936): “Rank correlation
and tests of significance involving no assumption of normality,” Annals of
Mathematical Statistics, 7, 29-43. JN
Housner, G. W., and Brennan, J. F. (1948): “The estimation of linear trends,”
Annals of Mathematical Statistics, 19, 380-93. IN [49-201]
Hsu, P. L. (1945): “The approximate distributions of the mean and variance of a
sample of independent variables,” Annals of Mathematical Statistics, 16,
1-29. N [45-233]
Huntington, E. V. (1937): “A rating table for card matching experiments,” 
Journal of Parapsychology, 4, 292-94. NP 

I 
Irick, Paul (1952): “Sampling distributions for dispersion statistics,” Bio- 
metrics, 8, 93-94; abstract. LN 


Irwin, J. O. (1935): “Tests of significance for differences between percentages
based on small numbers,” Metron, 12, 83-94.
——— (1949): “A note on the subdivision of χ² into components,” Biometrika,
36, 130-34. H [50-528]
Irwin, M. R., and Snedecor, George W. (1933): “On the chi-square for
homogeneity,” Iowa State College Journal of Science, 8, 75-81. H
Ising, Ernest (1925): “Beitrag zur Theorie des Ferromagnetismus,” Zeitschrift
für Physik, 31, 253-58. N


J 


Janko, Jaroslav (1950): “Advances in the theory of non-parametric tests in
statistical inference,” Časopis pro Pěstování Matematiky a Fysiky, 74, 62-74;
(Czech, English summary). A [51-429]
Jeeves, T. A., and Richards, Robert (1950): “A note on the power of the sign 
test,” Annals of Mathematical Statistics, 21, 618; abstract. G 
Jeffreys, Harold (1937): “The tests for sampling differences and contingency,” 
Proceedings of the Royal Society of London (A), 162, 479-95. H 
Johnson, N. L. (1948): “Tests of significance in the variate difference method,” 
Biometrika, 35, 206-09. IN [49-61] 
Johnson, Palmer O. (1950): “The quantification of data in discriminant analysis,”
Journal of the American Statistical Association, 45, 65-76. M


Jones, A. E. (1946): “A useful method for the routine estimation of dispersion 
from large samples,” Biometrika, 33, 274-82. LP [47-42] 











Jones, Herbert E. (1937): “The theory of runs as applied to time series,” Report
of the Third Annual Research Conference on Economics and Statistics,
Cowles Commission, 33-36. IO
Jones, Howard L. (1948): “Exact lower moments of order statistics in small
samples from a normal distribution,” Annals of Mathematical Statistics, 19,
270-73. LP [48-601]
——— (1953): “Approximating the mode from weighted sample values,” Journal
of the American Statistical Association, 48, 113-27. KP
de Jongh, B. H. (1941): “General minimum-probability theorem,” Koninklijke
Nederlandse Akademie van Wetenschappen Proceedings, 44, 738-43; (Dutch,
German summary). C [42-168]
Judd, Deane B. (1936): “A method for determining whiteness of paper, II,”
Paper Trade Journal, August 20, 1-7. JMO
Juncosa, M. L. (1949): “The asymptotic behavior of the minimum in a sequence
of random variables,” Duke Mathematical Journal, 16, 609-18. IN [50-375]
Jurgensen, C. E. (1947): “Table for determining phi coefficients,” Psychometrika,
12, 17-29. HP [47-477]


K 


Kaarsemaker, L., and van Wijngaarden, A. (1952): Tables for use in rank
correlation, Mimeographed report No. R73 from the Computation Department
of the Mathematical Centre, Amsterdam. JP [52-664]
Kac, M. (1949): “On deviations between theoretical and empirical distributions,”
Proceedings of the National Academy of Sciences (U.S.A.), 35, 252-57.
EN [49-614]
Kanellos, S. G. (1948): “Two problems of calculus of probability,” Bulletin de
la Société Mathématique de Gréce, 23, 132-42; (Greek, English summary).
N [48-518]
Kaplan, E. L. (1949): Distribution generated by the randomization of a fixed 
sample. Preliminary report, Mimeographed Report No. 44, Statistical Re- 
search Group, Princeton University. N 
Kaplansky, Irving (1945a): “The asymptotic distribution of runs of consecutive 
elements,” Annals of Mathematical Statistics, 16, 200-03. NP [46-208] 
——— (1945b): “A common error concerning kurtosis,” Journal of the American
Statistical Association, 40, 259. X [46-20]
Kaplansky, Irving, and Riordan, John (1945): “Multiple matching and runs by
the symbolic method,” Annals of Mathematical Statistics, 16, 272-77.
N [46-309]
——— (1946): “The problem of the rooks and its applications,” Duke Mathematical
Journal, 13, 259-68. N [46-508]
Katz, Leo (1952): “The distribution of the number of isolates in a social group,” 
Annals of Mathematical Statistics, 23, 271-76. NP [62-69] 
Kawata, Tatsuo (1951): “Limit distributions of single order statistics,” Reports
of Statistical Application Research Union of Japanese Scientists and Engineers,
1, 4-9. LN [52-142]
Keen, Joan, Page, Denys F., and Hartley, Herman O. (1953): “Estimating 
variability from the differences between successive readings (with appendix: 
Some theoretical aspects),” Applied Statistics, 2, 13-23. 
Keeping, E. S. (1952): “The problem of birth ranks,” Biometrics, 8, 112-19. 
INOP 



















Kemperman, J. H. B. (1950): De verdelingsfunctie van het aantal inversies in de
test van Mann en Whitney. (The distribution function of the number of
inversions in Mann and Whitney’s test), Mimeographed report T.W. 7 from
the Mathematical Centre, Amsterdam. GN
Kempthorne, Oscar (1952): The design and analysis of experiments, John Wiley
and Sons, Inc., New York, 1-631, chapters 7 and 8. G [52-572]
Kendall, M. G. (1938): “A new measure of rank correlation,” Biometrika, 30, 
81-93. JNP 
——— (1940): “Note on the distribution of quantiles for large samples,” Journal
of the Royal Statistical Society (B), 7, 83-85. LN [41-23]
——— (1942a): “Note on the estimation of a ranking,” Journal of the Royal
Statistical Society (B), 105, 119-21. MN [48-107]
——— (1942b): “Partial rank correlation,” Biometrika, 32, 277-83. JN [43-22]
——— (1945): “The treatment of ties in ranking problems,” Biometrika, 33,
239-51. JN [47-41]
——— (1947): “The variance of τ when both rankings contain ties,” Biometrika,
34, 297-98. JN [48-458]
——— (1948a): Rank correlation methods, Charles Griffin and Co., Ltd., London.
AJMNP
——— (1948b): The advanced theory of statistics, Charles Griffin and Co., Ltd.,
London; Chapters 12, 13, and 16. HJ
——— (1949): “Rank and product-moment correlation,” Biometrika, 36, 177-93.
JN [50-673]


Kendall, M. G., Kendall, Sheila F. H., and Smith, B. Babington (1938): “The 
distribution of Spearman’s coefficient of rank correlation in a universe in 
which all rankings occur an equal number of times,” Biometrika, 30, 251-73. 

JNP 

Kendall, M. G., and Smith, B. Babington (1938): “Randomness and random
sampling numbers,” Journal of the Royal Statistical Society, 101, 147-66. JO
——— (1939a): Tables of random sampling numbers, Tracts for Computers No.
24, Cambridge University Press, London N.W. 1. IP
——— (1939b): “Second paper on random sampling numbers,” Journal of the
Royal Statistical Society (B), 6, 51-61. IO
——— (1939c): “The problem of m rankings,” Annals of Mathematical Statistics,
10, 275-87. MNP [40-23]
——— (1939d): “On the method of paired comparisons,” Biometrika, 31, 324-45.
MNP [41-111]


Kerawala, S. M. (1948): “On bounds of skewness and kurtosis,” Bulletin of the
Calcutta Mathematical Society, 40, 41-44. X [49-134]
Kermack, W. O., and McKendrick, A. G. (1937a): “Tests for randomness in a 
series of numerical observations,” Proceedings of the Royal Society of Edin- 
burgh, 57, 228-40. INO 
(1937b): “Some distributions associated with a randomly arranged set of 
numbers,” Proceedings of the Royal Society of Edinburgh, 57, 332-76. IN 
(1938): “Some properties of points arranged at random on a Mobius 
surface,” Mathematical Gazette, 22, 66-72. N 
(1940): “The design and interpretation of experiments based on a four- 
fold table: the statistical assessment of the effect of treatment,” Proceed- 
ings of the Royal Society of Edinburgh, 60, 362-75. H [41-236] 
Khamis, Salem H. (1950): “A note on the general Chebycheff inequality,”
Proceedings of the International Congress of Mathematicians, 1, 569-70,
published by the American Mathematical Society, 1952. C
——— (1952): “On the reduced moment problem,” Annals of Mathematical
Statistics, 23, 642; abstract. C
Kimball, Bradford F. (1947): “Some basic theorems for developing tests of fit for 
the case of the non-parametric probability distribution function, I,” 
Annals of Mathematical Statistics, 18, 540-48. EN [48-225] 
——— (1950): “On the asymptotic distribution of the sum of powers of unit
frequency differences,” Annals of Mathematical Statistics, 21, 263-71.
N [50-673]
Kolmogorov, A. N. (1928): “Über die Summen durch den Zufall bestimmter un-
abhängiger Grössen,” Mathematische Annalen, 99, 309-19. C
——— (1933a): “Ueber die Grenzwertsätze der Wahrscheinlichkeitsrechnung,”
Bulletin Acad., Sc., U.R.S.S., 263-72. C
——— (1933b): “Sulla determinazione empirica delle leggi di probabilità,”
Giornale dell’ Istituto Italiano degli Attuari, 4, 1-11. EN
——— (1941): “Confidence limits for an unknown distribution function,”
Annals of Mathematical Statistics, 12, 461-63. EFNP [43-25]
——— (1950): Foundations of the theory of probability, Chelsea, New York, 42-43.
(Translated by Nathan Morrison.) C [50-374]
Kolmogorov, A. N., and Hinčin, A. Ya. (1951): “The work of N. V. Smirnov on
the investigation of properties of variational series and on nonparametric
problems of mathematical statistics,” Uspehi Matematičeskih Nauk (N.S.),
6, No. 4(44), 190-92; (Russian). A [52-198]
Kondo, Tsutomu (1929): “On the standard error of the mean square contin- 
gency,” Biometrika, 21, 376-428. H 
Krishna Iyer, P. V. (1947): “Random association of points on a lattice,” Nature, 160,
714. N [48-193]
(1948a): “The theory of probability distributions of points on a line,” 
Journal of the Indian Society of Agricultural Statistics, 1, 173-95. 
N [50-446] 
(1948b): “Random association of points on a lattice,” Nature, 162, 333. 
IN 
(1949a): “Calculation of factorial moments of certain probability dis- 
tributions,” Nature, 164, 282. N 
(1949b): “The first and second moments of some probability distribu- 
tions arising from points on a lattice and their applications,” Biometrika, 
36, 135-41. N [60-607] 
(1950a): “The theory of probability distributions of points on a lattice,” 
Annals of Mathematical Statistics, 21, 198-217. NP [50-732] 
——— (1950b): “Runs up and down on a lattice,” Nature, 166, 276.
N [51-841]
——— (1950c): “Further contributions to the theory of probability distribu-
tions of points on a line. I,” Journal of the Indian Society of Agricultural
Statistics, 2, 141-60. N [51-271]
——— (1951a): “A non-parametric method of testing k samples,” Nature, 167,
33. F





——— (1951b): “Further contributions to the theory of probability distribu-
tions of points on a line II,” Journal of the Indian Society of Agricultural
Statistics, 3, 80-93. N [52-142]
——— (1952a): “Further contributions to the theory of probability distribu-
tions of points on a line, III,” Journal of the Indian Society of Agricultural
Statistics, 4, 50-71. FIN [53-297]
——— (1952b): “Factorial moments and cumulants of distributions arising in
Markoff chains,” Journal of the Indian Society of Agricultural Statistics, 4,
113-23. N
Krishna Iyer, P. V., and Sukhatme, B. V. (1949): “Probability distribution of
points on a line,” Science and Culture, 15, 200. N [52-476]
Kruskal, W. H. (1952): “A nonparametric test for the several sample problem,”
Annals of Mathematical Statistics, 23, 525-40. GN [53-391]
Kruskal, W. H., and Wallis, W. A. (1952): “Use of ranks in one-criterion vari-
ance analysis,” Journal of the American Statistical Association, 47, 583-621.
GP
Kullback, Solomon (1939): “Note on a matching problem,” Annals of Mathe- 
matical Statistics, 10, 77-80. N 


Kunisawa, K., Makabe, H., and Morimura, H. (1951): “Tables of confidence
bands for the population distribution function. I,” Reports of Statistical
Application Research Union of Japanese Scientists and Engineers, 1, 23-44.
FP [52-569]
Kuznets, Simon (1929): “Random events and cyclical oscillations,” Journal of 
the American Statistical Association, 24, 258-75. IN 


Kvit, I. D. (1950): “On N. V. Smirnov’s theorem concerning the comparison of 
two samples,” Doklady Akademii Nauk SSSR (N.S.), 71, 229-31; (Rus-
sian). F [50-528]


L 


Laderman, Jack (1939): “The distribution of ‘Student’s’ ratio for samples of two 
items drawn from non-normal universes,” Annals of Mathematical Statistics, 
10, 376-79. K [40-153] 
Lal, D. N. (1952): “On the test of a hypothesis concerning two independent fre- 
quency distributions,” Journal of the Indian Society of Agricultural Sta- 
tistics, 4, 72-84. F [53-298] 
Lancaster, H. O. (1949): “The derivation and partition of χ² in certain discrete
distributions,” Biometrika, 36, 117-29. H [51-191]
——— (1950): “The exact partition of χ² and its application to the problem of
the pooling of small expectations,” Biometrika, 37, 267-70. E
——— (1951): “Complex contingency tables treated by the partition of χ²,”
Journal of the Royal Statistical Society (B), 13, 242-49. H [53-486]
Lehmann, E. L. (1951): “Consistency and unbiasedness of certain nonparametric 
tests,” Annals of Mathematical Statistics, 22, 165-79. ABF [61-126] 
(1953): “The power of rank tests,” Annals of Mathematical Statistics, 
24, 23-42. BGJ 
Lehmann, E. L., and Stein, C. (1949): “On the theory of some non-parametric 
hypotheses,” Annals of Mathematical Statistics, 20, 28-45. B [49-723] 
Leslie, P. H. (1951): “The calculation of χ² for an r×c contingency table,”
Biometrics, 7, 283-86. H [52-479]
Lesser, C. E. V. (1942): “Inequalities for multivariate frequency distributions,” 
Biometrika, 32, 284-93. CP [43-16] 


Lesser, Pamela C. V. (1933): “Note on the shrinkage of physical characters in
man and woman with age, as an illustration of the use of χ², P methods,”
Biometrika, 25, 197-202. EO
Levene, Howard (1946a): “A test of randomness in two dimensions,” Annals of 
Mathematical Statistics, 17, 500; abstract. I 
(1946b): “A test of randomness in two dimensions,” Bulletin of the 
American Mathematical Society, 52, 621; abstract. IN 
(1952): “On the power function of tests of randomness based on runs up 
and down,” Annals of Mathematical Statistics, 23, 34-56. BINP [52-762] 
Levene, H., and Wolfowitz, J. (1944): “The covariance matrix of runs up and 
down,” Annals of Mathematical Statistics, 15, 58-69. IN [44-208] 
Lévy, Paul (1936): “La loi forte des grands nombres pour les variables aléatoires
enchaînées,” Journal de Mathématiques pures et appliquées, 101, 11-24. CN
Lewis, Don, and Burke, C. J. (1949): “The use and misuse of the chi-square test,”
Psychological Bulletin, 46, 433-89. EHO
——— (1950): “Further discussion of the use and misuse of the chi-square test,”
Psychological Bulletin, 47, 347-55. E
Lindeberg, J. W. (1925): “Über die Korrelation,” Matematiker Kongressen i
Kobenhavn, 31 August-4 September 1925, 437-46. J
——— (1929): “Some remarks on the mean error of the percentage of correla-
tion,” Nordisk Statistik Tidskrift, 8, 138-42. J
Lloyd, E. H. (1952): “Least-squares estimation of location and scale parameters
using order statistics,” Biometrika, 39, 88-95. G [53-65]


Lollar, Robert M. (1952): “Statistical methods applied to the analysis and testing
of leather,” Journal of the American Leather Chemists Association, 48, 60-84.
O

Loève, Michel (1945): “Etude asymptotique des sommes de variables aléatoires
liées,” Journal de Mathématiques pures et appliquées, 109, 249-318.
CN [46-156]
——— (1950): “Fundamental limit theorems of probability theory,” Annals of
Mathematical Statistics, 21, 321-38. N


Lombard, Herbert L., and Doering, Carl R. (1947): “Treatment of the fourfold
table by partial association and partial correlation as it relates to public
health problems,” Biometrics, 3, 123-28. HO
Lord, E. (1947): “The use of range in place of standard deviation in the t-test,” 
Biometrika, 34, 41-67. LP [47-394] 


Lowry, Dorothy Cruden (1951): “A probabilistic study of runs in egg production.
(Preliminary Report),” Annals of Mathematical Statistics, 22, 485; abstract.
IO

Lurquin, Constant (1922): “Sur le critérium de Tchebycheff,” Comptes Rendus
(Paris), 175, 681-83. C
Lyerly, Samuel B. (1952): “The average Spearman rank correlation coefficient,” 
Psychometrika, 17, 421-28. JNP 


Mack, C. (1948): “An exact formula for Q_k(n), the probable number of k-aggre-
gates in a random distribution of n points,” Philosophical Magazine (7), 39,
778-90. N [49-310]

MacMahon, Percy (1915): Combinatory analysis, Vol. I, Cambridge University 
Press. N 





















——— (1916): Combinatory analysis, Vol. II, Cambridge University Press. N
Madow, William G. (1948): “On the limiting distributions of estimates based
on samples from finite universes,” Annals of Mathematical Statistics, 19,
535-45. BN [50-564]
Mahalanobis, P. C. (1944): “On large-scale sample surveys,” Philosophical 
Transactions of the Royal Society of London B, 231, 329-451. IN 
Mahalanobis, P. C. et al. (1933): “Tables of random samples from a normal
population,” Sankhyā, 1, 289-328. P


Mainland, Donald (1948): “Statistical methods in medical research. I. Qualita-
tive statistics (enumeration data),” Canadian Journal of Research, 26, 1-166.
HOP
——— (1952): Elementary medical statistics, W. B. Saunders Co., Philadelphia;
chapter IV. HP
Mainland, Donald, and Murray, I. M. (1952): “Tables for use in fourfold con-
tingency tests,” Science, 116, 591-94. HP [53-389]
Malmquist, S. (1950): “On a property of order statistics from a rectangular dis- 
tribution,” Skandinavisk Aktuarietidskrift, 33, 214-22. EN [61-726] 


Maniya, G. M. (1949): “Generalizations of the criterion of A. N. Kolmogorov 
for an estimate for the law of distribution for empirical data,” Doklady 
Akademiya Nauk SSSR (N.S.), 69, 495-97; (Russian). EN [60-261] 

Mann, H. B. (1945a): “Nonparametric tests against trend,” Econometrica, 13, 
245-59. IN [46-21] 

——— (1945b): “On a test for randomness based on signs of differences,” Annals 
of Mathematical Statistics, 16, 193-99. IN [46-22] 

——— (1950): “Nonparametric tests against trend,” Statistical inference in 
dynamic economic models, chapter 12, John Wiley and Sons, Inc., New York. 

IP 

Mann, H. B., and Wald, A. (1942): “On the choice of the number of class intervals 
in the application of the chi square test,” Annals of Mathematical Statistics, 
13, 306-17. E [48-106] 

Mann, H. B., and Whitney, D. R. (1947): “On a test of whether one of two ran- 
dom variables is stochastically larger than the other,” Annals of Mathe- 
matical Statistics, 18, 50-60. GNP [48-161] 

Marbe, K. (1926): Mathematische Bemerkungen, C. H. Beck, München and Ber-
lin, 8-9. IN
——— (1934): Grundfragen der angewandten Wahrscheinlichkeitsrechnung und
theoretischer Statistik, C. H. Beck, München and Berlin, 26. IN
Markoff, A. (1884a): On certain applications of algebraic continued fractions,
Thesis, St. Petersburg, (Russian). C
——— (1884b): “Démonstration de certaines inégalités de M. Tchébychef,”
Mathematische Annalen, 24, 172-80. C
——— (1924): Calculus of probability, Soviet Press, Moscow, 4th edition, 588
pp.; (Russian). C

Marshall, Andrew W. (1951): “A large-sample test of the hypothesis that one of 
two random variables is stochastically larger than the other,” Journal of 
the American Statistical Association, 46, 366-74. FGN [62-260] 











Marshall, Andrew W., and Walsh, John E. (1950): “Some tests for comparing
percentage points of two arbitrary continuous populations,” Proceedings of
the International Congress of Mathematicians, 1, 582-83; published by the
American Mathematical Society, 1952. G










Massey, Frank J., Jr. (1950a): “A note on the power of a non-parametric test,”
Annals of Mathematical Statistics, 21, 440-43. E [51-117]
——— (1950b): “A note on the estimation of a distribution function by confi-
dence limits,” Annals of Mathematical Statistics, 21, 116-19. ENP [50-446]
——— (1951a): “The Kolmogorov-Smirnov test for goodness of fit,” Journal
of the American Statistical Association, 46, 68-78. EFP
——— (1951b): “The distribution of the maximum deviation between two sam-
ple cumulative step functions,” Annals of Mathematical Statistics, 22, 125-
28. FNP [51-621]
——— (1951c): “A note on a two sample test,” Annals of Mathematical Sta-
tistics, 22, 304-06. FN [51-841]
——— (1952a): “Nonparametric comparisons of populations when data are
collected in homogeneous groups,” Annals of Mathematical Statistics, 23,
642; abstract. G
——— (1952b): “Distribution table for the deviation between two sample
cumulatives,” Annals of Mathematical Statistics, 23, 435-41. FP [52-63]
——— (1952c): “Correction to ‘a note on the power of a nonparametric test’,”
Annals of Mathematical Statistics, 23, 637-38. E [53-391]
——— (1952d): “On the analysis of data matched in pairs,” Annals of Mathe-
matical Statistics, 23, 475; abstract. G
Masuyama, Motosaburô (1942): “The Bienaymé-Tchebycheff inequality for
Hermitic tensors,” Proceedings of the Physical-Mathematical Society of
Japan (8), 24, 409-11. C [46-310]
——— (1951): “An improved binomial probability paper and its use with
tables,” Reports of Statistical Application Research Union of Japanese
Scientists and Engineers, 1, 15-22. NX [52-96]
Mathen, K. K. (1946): “A criterion for testing whether two samples have come
from the same population without assuming the nature of the population,”
Sankhyā, 7, 329. F [47-44]
Mathisen, Harold C. (1943): “A method of testing the hypothesis that two
samples are from the same population,” Annals of Mathematical Statistics,
14, 188-94. FNP [44-128]
Mathematical Centre (1952): Auxiliary table for Wilcoxon’s two sample test,
Report R132/S86 Mathematical Centre, Amsterdam, 35 pp. GP [52-189]
Mauldon, J. G. (1951): “Random division of an interval,” Proceedings of the 
Cambridge Philosophical Society, 47, 331-36. N [61-727] 
May, Joyce M. (1952): “Extended and corrected tables of the upper percentage 
points of the ‘Studentized’ range,” Biometrika, 39, 192-93. LP [62-961] 
McCarthy, Philip J. (1947): “Approximate solutions for means and variances in a 
certain class of box problems,” Annals of Mathematical Statistics, 18, 349-83. 

NP [60-41] 

McIntyre, G. A. (1952): “A method for unbiased selective sampling using ranked 
sets,” Australian Journal of Agricultural Research, 4, 385-90. LP 
McKay, A. J. (1935): “The distribution of the difference between the extreme 
observation and the sample mean in samples of n from a normal universe,” 
Biometrika, 27, 466-71. LPN 
McMillan, Brockway (1949): “Spread of minima of large samples,” Annals of 
Mathematical Statistics, 20, 444-47. LN [62-479] 
McNemar, Quinn (1947): “Note on the sampling error of the difference between 
correlated proportions or percentages,” Psychometrika, 12, 153-57. J 










Medolaghi, P. (1909): “La teoria del rischio e le sue applicazioni,” Transactions
of the Sixth International Actuarial Congress, Vienna, 6, 723. C
Meidell, Birger (1921): “Quelques inégalités sur les fonctions monotones,”
Skandinavisk Aktuarietidskrift, 4, 230-38. C
——— (1922): “Sur un problème du calcul des probabilités et les statistiques
mathématiques,” Comptes Rendus (Paris), 175, 806-08. C
——— (1923): “Sur la probabilité des erreurs,” Comptes Rendus (Paris), 176,
280-82. C
Meizler, D. G. (1949): “On a problem of B. V. Gnedenko,” Ukrainskiĭ Mate-
matičeskiĭ Žurnal, 1, 67-84; (Russian). L [52-186]
——— (1950): “On the limit distribution of the maximal term of a variational
series,” Dopovidi Akademii Nauk Ukrainskoï RSR, 1950, 3-10; (Ukrai-
nian, Russian summary). LN [52-663]
Midzuno, Hiroshi (1950): “On certain groups of inequalities. Confidence in-
tervals for the mean,” Annals of the Institute of Statistical Mathematics,
Tokyo, 2, 21-33. C [51-609]
Mihalevič, V. S. (1952): “On the mutual disposition of two empirical distribution
functions,” Doklady Akademii Nauk SSSR (N.S.), 85, 485-88; (Russian).
FN [53-297]
Mihoc, G. (1943): “Sur le problème des itérations dans une suite d’épreuves,”
Bulletin Mathématique de la Société Roumaine des Sciences, 45, 81-95.
IN [46-19]
——— (1949): “La loi limite de la probabilité des nombres des itérations de
longueur donné,” Academia Republicii Populare Romane Bucharest Buletin
Stiintific (A). N [52-363]
von Mises, R. (1931a): “Ueber einige Abschaetzungen von Erwartungswerten,” 
Crelles Journal fuer die reine und angewandte Mathematik, 165, 184-93. C 
(1931b): Vorlesungen aus dem Gebiete der angewandten Mathematik 1. 
Band Wahrscheinlichkeitsrechnung, Leipzig und Wien, 309-35. E 
(1936): “La distribution de la plus grande de n valeurs,” Revue Mathé- 
matique de l'Union Interbalkanique, 1, 1-20. LN 
(1938): “Sur une inégalité pour les moments d’une distribution quasi- 
convexe,” Bulletin des Sciences Mathématiques (2), 62, 68-71. C 
(1939a): “An inequality for the moments of a discontinuous distribution,” 
Skandinavisk Aktuarietidskrift, 1939, 32-36. X [40-22] 
(1939b): “The limits of a distribution function if two expected values are 
given,” Annals of Mathematical Statistics, 10, 99-104. C
(1947): “On the asymptotic distribution of differentiable statistical func- 
tions,” Annals of Mathematical Statistics, 18, 309-48. EN [48-194] 
Mood, A. M. (1940): “The distribution theory of runs,” Annals of Mathematical 
Statistics, 11, 367-92. N [41-228] 
(1941): “On the joint distribution of the medians in samples from a multi- 
variate population,” Annals of Mathematical Statistics, 12, 268-78. 
LN [42-172] 
(1949): “Tests of independence in contingency tables as unconditional 
tests,” Annals of Mathematical Statistics, 20, 114-16. H [49-466] 
(1950): Introduction to the theory of statistics, McGraw-Hill Book Co., 
New York; chapter 16. GJN [60-445] 
Moore, Geoffrey H., and Wallis, W. A. (1943): “Time series significance tests
based on signs of differences,” Journal of the American Statistical Association,
38, 153-64. INP [43-281]
Moore, P. G. (1949): “A test for randomness in a sequence of two alternatives
involving a 2×2 table,” Biometrika, 36, 305-16. IN [50-447]
Moran, P. A. P. (1947a): “Random associations on a lattice,” Proceedings of the 
Cambridge Philosophical Society, 43, 321-28. IN [47-592] 
——— (1947b): “On the method of paired comparisons,” Biometrika, 34, 363-
65. MN [48-363]
——— (1947c): “The random division of an interval,” Journal of the Royal
Statistical Society (B), 9, 92-98. N [48-291]
——— (1948a): “Rank correlation and permutation distributions,” Proceedings
of the Cambridge Philosophical Society, 44, 142-44. JN [48-263]
——— (1948b): “Rank correlation and product-moment correlation,” Bio-
metrika, 35, 203-06. JP [48-601]
——— (1948c): “The interpretation of statistical maps,” Journal of the Royal
Statistical Society (B), 10, 243-51. IN [49-650]
——— (1950a): “Recent developments in ranking theory,” Journal of the Royal
Statistical Society (B), 12, 153-62. AJ [51-726]
——— (1950b): “A curvilinear ranking test,” Journal of the Royal Statistical
Society (B), 12, 292-95. JNP [51-841]
——— (1951a): “The random division of an interval, II,” Journal of the Royal
Statistical Society (B), 13, 147-50. IN [52-667]
——— (1951b): “Partial and multiple rank correlation,” Biometrika, 38, 26-32.
JNP [52-141]

Moriguti, Sigeiti (1951): “Extremal properties of extreme value distributions,” 
Annals of Mathematical Statistics, 22, 523-36. KLNP [62-570] 





——— (1952a): “A lower bound for a probability moment of any absolutely con-
tinuous distribution with finite variance,” Annals of Mathematical Sta-
tistics, 23, 286-89. PX [52-853]
——— (1952b): “Bounds for second moments of the sample range,” Annals of
Mathematical Statistics, 23, 475; abstract. L
——— (1953): “A modification of Schwarz’s inequality with applications to dis-
tributions,” Annals of Mathematical Statistics, 24, 107-13. PX
Moroney, M. J. (1951): Facts from figures, A Pelican Book—Penguin Book
Corp., Baltimore; chapters 15 and 18. EGHJ
Moshman, Jack (1952): “Testing a straggler mean in a two-way classification
using the range,” Annals of Mathematical Statistics, 23, 126-32. LP
Moses, Lincoln E. (1952a): “Non-parametric statistics for psychological re-
search,” Psychological Bulletin, 49, 122-43. AO
——— (1952b): “A two-sample test,” Psychometrika, 17, 239-47. GP
Mosteller, Frederick (1941): “Note on an application of runs to quality control
charts,” Annals of Mathematical Statistics, 12, 228-32. AINOP [42-10]
——— (1946): “On some useful ‘inefficient’ statistics,” Annals of Mathematical
Statistics, 17, 377-408. LNP [47-477]
——— (1948): “A k-sample slippage test for an extreme population,” Annals of
Mathematical Statistics, 19, 58-65. GNP [48-454]
(1951a): “Remarks on the method of paired comparisons: I. The least 
squares solution assuming equal standard deviations, and equal correla- 
tions,” Psychometrika, 16, 3-9. M 

































——— (1951b): “Remarks on the method of paired comparisons: II. The effect
of an aberrant standard deviation when equal standard deviations and
equal correlations are assumed,” Psychometrika, 16, 203-06. M
——— (1951c): “Remarks on the method of paired comparisons: III. A test of
significance for paired comparisons when equal standard deviations and
equal correlations are assumed,” Psychometrika, 16, 207-18. M
——— (1953): “Statistical theory and research design,” Annual Review of
Psychology, 4, 407-34. A
Mosteller, Frederick, and Tukey, J. W. (1949): “The uses and usefulness of
binomial probability paper,” Journal of the American Statistical Association,
44, 174-212. NX
——— (1950): “Significance levels for a k-sample slippage test,” Annals of
Mathematical Statistics, 21, 120-23. GNP [50-608]
Mullemeister, Hermance (1945): “Mean lengths of line segments,” American 
Mathematical Monthly, 52, 250-52. N [48-232] 
Münzner, Hans (1950): “Zur Abschätzung von Wahrscheinlichkeiten,” Mit-
teilungsblatt für Mathematische Statistik, 2, 130-37. C [51-424]
Murphy, R. B. (1948): “Non-parametric tolerance limits,” Annals of Mathemati-
cal Statistics, 19, 581-89. DP [49-404]
N 


Nair, A. N. Krishnan (1941): “Distribution of Student’s ‘t’ and the correlation
coefficient in samples from non-normal populations,” Sankhyā, 5, 383-400.
GK [43-164]
——— (1942): “On the probability of obtaining k sets of consecutive successes
in n trials,” Mathematics Student, 10, 83-84. N [43-248]
Nair, K. R. (1937): “A note on the exact distribution of λ_n,” Sankhyā, 3, 171-74.
EN
——— (1938): “On Tippett’s ‘random sampling numbers’,” Sankhyā, 4, 65-72.
IO
——— (1940a): “The median in tests by randomization,” Sankhyā, 4, 543-50.
GN [43-108]
——— (1940b): “Table of confidence interval for the median in samples from
any continuous population,” Sankhyā, 4, 551-58. GNP [43-165]
——— (1948a): “The Studentized form of the extreme mean square test in the
analysis of variance,” Biometrika, 35, 16-31. GLNP [48-601]
——— (1948b): “The distribution of the extreme deviate from the sample mean
and its Studentized form,” Biometrika, 35, 118-44. LNP [48-602]
——— (1950): “Efficiencies of certain linear systematic statistics for estimating
dispersion from normal samples,” Biometrika, 37, 182-83. K [51-116]
——— (1952): “Tables of percentage points of the ‘Studentized’ extreme deviate
from the sample mean,” Biometrika, 39, 189-91. LP [52-961]
Nair, K. R., and Banerjee, K. S. (1942): “A note on fitting of straight lines if
both variables are subject to error,” Sankhyā, 6, 331. J [44-126]
Nair, K. R., and Shrivastava, M. P. (1942): “On a simple method of curve
fitting,” Sankhyā, 6, 121-32. JL [43-279]
Narumi, Seimatsu (1923): “On further inequalities with possible application to 
problems in the theory of probability,” Biometrika, 15, 245-53. AC 
Neyman, J. (1935): “La vérification de l’hypothèse concernant la loi de prob-
abilité d’une variable aléatoire,” Comptes Rendus (Paris), 203, 1047-49.
EN
——— (1937): “ ‘Smooth test’ for goodness of fit,” Skandinavisk Aktuarietid-
skrift, 1937, 149-99. EN
——— (1940): “Empirical comparison of the ‘smooth’ test for goodness of fit
with the Pearson’s χ² test,” Annals of Mathematical Statistics, 11, 478; ab-
stract. E
——— (1942): “Basic ideas and some recent results of the theory of testing
statistical hypotheses,” Journal of the Royal Statistical Society (N.S.), 105,
292-327; sections 20, 21. B [44-44]
——— (1949): “Contribution to the theory of the χ² test,” Proceedings of the
Berkeley Symposium on Mathematical Statistics and Probability, University
of California Press, 239-73. BE [49-388]
——— (1951): “Existence of consistent estimates of the directional parameter
in a linear structural relation between two variables,” Annals of Mathemati-
cal Statistics, 22, 497-512. J [52-481]
Neyman, J., and Pearson, E. S. (1928): “On the use and interpretation of certain
test criteria for purposes of statistical inference. Part II,” Biometrika, 20A,
263-94. EFH
——— (1930): “Further notes on the χ² distribution,” Biometrika, 22, 298-305.
EP
Neyman, J., and Scott, E. L. (1951): “On certain methods of estimating the
linear structural relation,” Annals of Mathematical Statistics, 22, 352-61.
J [52-259]
——— (1952): “Correction to ‘On certain methods of estimating the linear struc-
tural relation’,” Annals of Mathematical Statistics, 23, 135. J


Nimeroff, Isadore (1952): “Analysis of goniophotometric reflection curves,” 
Journal of Research of the National Bureau of Standards, 48, 441-48. JO 
Noether, Gottfried E. (1948): “On confidence limits for quantiles,” Annals of
Mathematical Statistics, 19, 416-19. G [49-135]
——— (1949): “On a theorem by Wald and Wolfowitz,” Annals of Mathematical
Statistics, 20, 455-58. BN [50-188]
——— (1950): “Asymptotic properties of the Wald-Wolfowitz test of random-
ness,” Annals of Mathematical Statistics, 21, 231-46. BIN [50-732]
——— (1951): “On a connection between confidence and tolerance intervals,”
Annals of Mathematical Statistics, 22, 603-04. DG [52-667]
Norton, W. H. (1946): “Estimating the correlation coefficient,” Bulletin of the 
American Meteorological Society, 27, 589-90. ae 





Nybølle, H. Cl. (1936): “On the statistical distinction between sets of two-
dimensional observations,” Skandinavisk Aktuarietidskrift, 19, 1-26. FO


O


Offord, A. C. (1945): “An inequality for sums of independent random variables,” 
Proceedings of the London Mathematical Society (2), 48, 467-77. C [47-281] 
Ogawa, Junjiro (1951): “Contributions to the theory of systematic statistics,
I,” Osaka Mathematical Journal, 3, 175-213. LP [52-762]
Okamoto, Masashi (1952): “On a non-parametric test,” Osaka Mathematical 
Journal, 4, 77-85. E [52-190] 


Olds, Edwin G. (1935): “Distribution of greatest variates, least variates, and
intervals of variation in samples from a rectangular universe,” Bulletin of
the American Mathematical Society, 41, 297-304. LNP
——— (1938a): “Distributions of sums of squares of rank differences for small
numbers of individuals,” Annals of Mathematical Statistics, 9, 133-48. JP
——— (1938b): “A moment-generating function which is useful in solving cer-
tain matching problems,” Bulletin American Mathematical Society, 44, 407-
13. N
——— (1949): “The 5% significance levels for sums of squares of rank differ-
ences and a correction,” Annals of Mathematical Statistics, 20, 117-18.
INP [49-467]
——— (1952): “A note on the convolution of uniform distributions,” Annals of
Mathematical Statistics, 23, 282-85. N [52-64]
Olmstead, P. S. (1940): “Note on theoretical and observed distributions of 
repetitive occurrences,” Annals of Mathematical Statistics, 11, 363-66. 
EO [41-109] 
——— (1942): Review of “A significance test for time series,” by W. Allen
Wallis and Geoffrey H. Moore, Journal of the American Statistical Associa-
tion, 37, 152-53. I
——— (1946): “Distribution of sample arrangements for runs up and down,”
Annals of Mathematical Statistics, 17, 24-33. INP [46-457]
Olmstead, P. S., and Tukey, J. W. (1947): “A corner test for association,” 
Annals of Mathematical Statistics, 18, 495-513. JNP [48-294] 
P

Pastore, Nicholas (1950): “Some comments on ‘the use and misuse of the chi-
square test’,” Psychological Bulletin, 47, 338-40. E


Patnaik, P. B. (1948): “The power function of the test for the difference be-
tween two proportions in a 2×2 table,” Biometrika, 35, 157-75. HP [48-603]
——— (1949): “The non-central χ²- and F-distributions and their applications,”
Biometrika, 36, 202-32. ENP [50-608]
——— (1950): “The use of mean range as an estimator of variance in statistical
tests,” Biometrika, 37, 78-87. LP [51-116]
Paulson, Edward (1943): “A note on tolerance limits,” Annals of Mathematical
Statistics, 14, 90-93. D [43-280]
Pearson, E. S. (1931): “The analysis of variance in cases of non-normal varia- 

tion,” Biometrika, 23, 114-33. GK 
(1932): “The percentage limits for the distribution of range in samples 
from a normal population. (nS$100),” Biometrika, 24, 404-17. LP 
(1937): “Some aspects of the problem of randomization,” Biometrika, 29, 
53-64. AKL 
(1938a): “The probahility integral transformation for testing goodness of 
fit and combining independent tests of significance,” Biometrika, 30, 134— 
48. BE 
(1938b): “Some aspects of the problem of randomization II. An illustra- 
tion of ‘Student’s’ inquiry into the effect of ‘balancing’ in agricultural ex- 























periments,” Biometrika, 30, 159-79. AKL 
(1947): “The choice of statistical tests illustrated on the interpretation of 
data classed in a 2X2 table,” Biometrika, 34, 139-67. AHP [47-396] 


(1950a): “On questions raised by the combination of tests based on dis- 
continuous distributions,” Biometrika, 37, 383-98. X [51-429] 














884 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1953 





—— (1950b): “Some notes on the use of range,” Biometrika, 37, 88-92.
GK [51-116]
—— (1952): “Comparison of two approximations to the distribution of the
range in small samples from normal populations,” Biometrika, 39, 130-36.
LN [52-969]

Pearson, E. S., assisted by Adyanthaya, N. K. (1929): “The distribution of fre-
quency constants in small samples from non-normal symmetrical and skew
populations,” Biometrika, 21, 259-86. K
Pearson, E. S., and Haines, Joan (1935): “The use of range in place of standard
deviation in small samples,” Journal of the Royal Statistical Society (B), 2,
83-98. L
Pearson, E. S., and Hartley, H. O. (1943): “Tables of the probability integral of
the Studentized range,” Biometrika, 33, 89-99. LNP [45-92]
Pearson, E. S., and Merrington, Maxine (1948): “2×2 tables; the power func-
tion of the test on a randomized experiment,” Biometrika, 35, 331-45.
HP [49-388]
—— (1951): “Tables of the 5% and 0.5% points of Pearson curves (with argu-
ment β₁ and β₂) expressed in standard measure,” Biometrika, 38, 4-10. KP

Pearson, Karl (1900): “On the criterion that a given system of deviations from
the probable in the case of a correlated system of variables is such that it
can reasonably be supposed to have arisen from random sampling,” Philo-
sophical Magazine 5 Series, 50, 157-75. ENOP
—— (1907): “On further methods of determining correlation,” Drapers’
Company Research Memoirs, Biometric Section IV, London. L
—— (1911): “On the probability that two independent distributions of fre-
quency are really samples from the same populations,” Biometrika, 8,
250-54. FNO
—— (1916a): “On the general theory of multiple contingency with special
reference to partial contingency,” Biometrika, 11, 145-58. H
—— (1916b): “On the application of ‘goodness of fit’ tables to test regression
curves and theoretical curves used to describe observational or experimental
data,” Biometrika, 11, 239-61. J
—— (1919): “On generalized Tchebycheff theorems in the mathematical
theory of statistics,” Biometrika, 12, 284-96. CP
—— (1920): “On the probable errors of frequency constants, Part III” (Edi-
torial), Biometrika, 13, 113-32. LNP
—— (1922): “On the χ² test of goodness of fit,” Biometrika, 14, 186-91. E
—— (1924): “On the difference and the doublet tests for ascertaining whether
two samples have been drawn from the same population,” Biometrika, 16,
249-52. E
—— (1931): Tables for statisticians and biometricians, Cambridge University
Press, England. P
—— (1932a): “Experimental discussion of the (χ², P) test for goodness of
fit,” Biometrika, 24, 351-81. EO
—— (1932b): “On the probability that two independent distributions of fre-
quency are really samples from the same parent population,” Biometrika,
24, 457-70. F
—— (1933a): “On the parent population with independent variates which
gives the minimum value of φ² for a given sample,” Biometrika, 25, 134-46.
E







































—— (1933b): “On a method of determining whether a sample of size n sup-
posed to have been drawn from a parent population having a known prob-
ability integral has probably been drawn at random,” Biometrika, 25, 379-
410. ENP
—— (1934): “On a new method of determining ‘goodness of fit’,” Biometrika,
26, 425-42. EP
Pearson, Karl and Heron, David (1913): “On theories of association,” Bio-
metrika, 9, 159-315. HO
Peek, R. L., Jr. (1933): Some new theorems on limits of variation, Bell Telephone
System Monograph B-789, New York. C
Peirce, F. T. (1926): “Tensile tests for cotton yarns V: ‘the weakest link’—
theorems on the strength of long and of composite specimens,” Journal of
the Textile Institute, Transactions, 17, 355. LO
Perks, Wilfred (1947): “A simple proof of Gauss’s inequality,” Journal of the
Institute of Actuaries Students’ Society, 7, 38-41. C [48-15]
Perlo, Victor (1933): “On the distribution of Student’s ratio for samples of three
drawn from a rectangular distribution,” Biometrika, 25, 203-04. K
Peters, Charles C. (1950): “The misuse of chi-square—a reply to Lewis and
Burke,” Psychological Bulletin, 47, 331-37. E




Petrov, A. A. (1951): “The verification of an hypothesis concerning the normality
of distributions by small samples,” Doklady Akademii Nauk SSSR, 76,
355-58. Translation by Curtiss D. Benster, edited by D. Teichroew as Na-
tional Bureau of Standards Report No. 2116, December 10, 1952. F [51-622]
Picard, H. C. (1951): “A note on the maximum value of kurtosis,” Annals of
Mathematical Statistics, 22, 480-82. X [52-141]
Pillai, K. C. S. (1948): “A note on ordered samples,” Sankhyā, 8, 375-80.
L [49-723]
—— (1950): “On the distributions of midrange and semi-range in samples
from a normal population,” Annals of Mathematical Statistics, 21, 100-05.
LP [50-446]
—— (1951): “Some notes on ordered samples from a normal population,”
Sankhyā, 11, 23-28. GLP
—— (1952): “On the distribution of ‘Studentized’ range,” Biometrika, 39,
194-95. LNP [52-961]
Pillai, K. C. S., and Ramachandran, K. V. (1953): “On the distribution of the
ratio of the ith observation in an ordered sample from a normal population
to an independent estimate of the standard deviation,” Annals of Mathe-
matical Statistics, 24, 146; abstract. L
Pitman, E. J. G. (1937a): “Significance tests which may be applied to samples
from any populations,” Journal of the Royal Statistical Society (B), 4, 119-
30. G
—— (1937b): “Significance tests which may be applied to samples from any
populations II. The correlation coefficient test,” Journal of the Royal Sta-
tistical Society, 4, 225-32. J
—— (1937c): “Significance tests which may be applied to samples from any
populations. III. The analysis of variance test,” Biometrika, 29, 322-35.
GNP
—— (1939): “The estimation of the location and scale parameters of a con-
tinuous population of any given form,” Biometrika, 30, 391-421. G































—— (1948): Lecture notes on non-parametric statistics, Columbia University,
N. Y. AB
Plackett, R. L. (1947): “Limits of the ratio of mean range to standard deviation,” 
Biometrika, 34, 120-22. KPX [47-395] 


van der Plank, J. E. (1947): “A method for estimating the number of random 
groups of adjacent diseased plants in a homogeneous field,” Transactions of 
the Royal Society of South Africa, 31, 269-78. IN 
Pollaczek, Félix (1952): “Fonctions caractéristiques de certaines répartitions
définies au moyen de la notion d’ordre. Application à la théorie des at-
tentes,” Comptes Rendus (Paris), 234, 2334-36. N [52-957]
Pompilj, G. (1952): “Osservazioni sull’omogamia: La trasformazione di Yule e il
limite della trasformazione ricorrente di Gini,” Università di Roma. Istituto
Nazionale di Alta Matematica. Rendiconti di Matematica e delle sue Ap-
plicazioni, (5)9, 367-88. H [52-63]
Possé, C. (1886): “Sur quelques applications des fractions continues algébriques,” 
St. Petersburg, 175. C 
Poti, S. Janardan (1950): “Power function of the chi-square test with special
reference to analysis of blood group data,” Sankhyā, 10, 397-406. E
Pridmore, W. A. (1944): “An application of statistical methods to a presswork
operation (aircraft stringers),” (British) Ministry of Supply Advisory
Service on Statistical Method and Quality Control. Technical Report No.
Q.C./E/8 (18 June 1944). JO

Q 
Quenouille, M. H. (1952): Associated measurements, Academic Press Inc., New 
York. J 

R 


Rajalakshma, D. V. (1943): “On the interval between the ranked individuals of 
samples taken from a rectangular population,” Journal of the Madras Uni- 
versity Section B, 15, 31-44. N [46-236] 

Rao, C. Radhakrishna (1952): Advanced statistical methods in biometric research,
John Wiley and Sons, New York, 1-390; pp. 191-205. H [53-388]
Reiersøl, Olav (1944): “Measures of departure from symmetry,” Skandinavisk
Aktuarietidskrift, 27, 229-34. X [46-211]
Rényi, Alfréd (1951): “New results in the field of probability,” Magyar Tudo-
mányos Akadémia Mathematikai és Fizikai Osztály Közleményei, 2, 125-39;
discussion 140-44; see page 128; (Hungarian). A [53-121]
Reynolds, John H. (1952): “Two textile applications of the chi-square control
chart,” Textile Division Supplement to Industrial Quality Control 1, 49-54.
O

Rhodes, E. C. (1924): “On the problem whether two given samples can be sup- 
posed to have been drawn from the same population,” Biometrika, 16, 239- 
48. F 

Rider, Paul R. (1929): “On the distribution of the ratio of mean to standard 
deviation in small samples from non-normal universes,” Biometrika, 21, 
124-41. KP 

—— (1950): “The distribution of ranges from a discrete rectangular popula-
tion,” Proceedings of the International Congress of Mathematicians, 1, p. 583;
published by the American Mathematical Society, 1952. LN





















—— (1951a): “The distribution of the range in samples from a discrete
rectangular population,” Journal of the American Statistical Association,
46, 375-78. LN [52-961]
—— (1951b): “The distribution of the quotient of ranges in samples from a
rectangular population,” Journal of the American Statistical Association,
46, 502-07. LNP [52-761]
—— (1953): “The distribution of the product of ranges in samples from a
rectangular population,” Bulletin of the American Mathematical Society,
59, 175; abstract. L
Rijkoort, P. G. (1952): “A generalization of Wilcoxon’s test,” Proceedings
Koninklijke Nederlandse Akademie van Wetenschappen Series A, 55 = In-
dagationes Mathematicae, 14, 394-404. GP [53-391]
Robbins, Herbert (1944a): “On distribution-free tolerance limits in random
sampling,” Annals of Mathematical Statistics, 15, 214-16. D [45-9]
—— (1944b): “On the expected values of two statistics,” Annals of Mathe-
matical Statistics, 15, 321-23. N [45-162]
—— (1945): “On the measure of a random set. II,” Annals of Mathematical
Statistics, 16, 342-47. N [47-889]
—— (1948): “Some remarks on the inequality of Tchebychef,” Courant Anni-
versary Volume, Interscience Publishers, Inc., N. Y., 345-50. C [48-193]

Robinson, Selby (1933): “An experiment regarding the χ² test,” Annals of
Mathematical Statistics, 4, 285-87. E
Romanovski, V. I. (1940): “On the inductive conclusions in statistics,” Comptes
Rendus (Doklady) de l’Académie des Sciences URSS (N.S.), 27, 419-21.
C [41-112]
Rosander, A. C. (1942): “The use of inversions as a test of random order,”
Journal of the American Statistical Association, 37, 352-58. I
Rosenblatt, Murray (1952a): “Remarks on a multivariate transformation,”
Annals of Mathematical Statistics, 23, 470-72. E [52-189]
—— (1952b): “Limit theorems associated with variants of the von Mises
statistic,” Annals of Mathematical Statistics, 23, 617-23. EN
Royden, H. L. (1952): Bounds on a distribution function when its first n mo-
ments are given, Applied Mathematics and Statistics Laboratory, Stanford
University, California, Technical Report No. 16. C


S


Sakamoto, H. (1943): “On the distributions of the product and the quotient
of the independent and uniformly distributed random variables,” The
Tôhoku Mathematical Journal, 49, 243-60. N [47-623]
Sandelius, Martin (1952): “A confidence interval for the smallest proportion of a
binomial population,” Journal of the Royal Statistical Society (B), 14, 115-
16. LN [53-488]
Santaló, L. A. (1947): “On the first two moments of the measure of a random set,”
Annals of Mathematical Statistics, 18, 37-49. N [48-389]
Savur, S. R. (1937): “The use of the median in tests of significance,” Proceedings
of the Indian Academy of Sciences (A), 5, 564-76. G
—— (1938): “The median versus mean or any other statistic in tests of sig-
nificance,” Sankhyā, 4, 92; abstract. G


Sawkins, D. T. (1941): “Remarks on goodness of fit of hypotheses and on Pear-
son’s χ² test,” Journal of the Proceedings of the Royal Society of New South
Wales, 75, 85-95. E [42-175]
—— (1947): “A new method of approximating the binomial and hypergeo-
metric probabilities,” Journal of the Royal Statistical Society of New South
Wales, 81, 38-47. N [48-193]
Scheffé, Henry (1943a): “On a measure problem arising in the theory of non-
parametric tests,” Annals of Mathematical Statistics, 14, 227-33. B [44-44]
—— (1943b): “Statistical inference in the non-parametric case,” Annals of
Mathematical Statistics, 14, 305-32. A [44-211]
—— (1952): “An analysis of variance for paired comparison,” Journal of the
American Statistical Association, 47, 381-400. MO [58-488]
Scheffé, Henry, and Tukey, J. W. (1944): “A formula for sample sizes for popu-
lation tolerance limits,” Annals of Mathematical Statistics, 15, 217.
D [46-9]
—— (1945): “Non-parametric estimation. I. Validation of order statistics,”
Annals of Mathematical Statistics, 16, 187-92. BD [46-21]


von Schelling, H. (1939): “Kennzeichen für eine rein zufällige Folge der Werte
in einer zeitlich geordneten Beobachtungsreihe,” Astronomische Nach-
richten, 269, 155-59. IN [40-249]
Schmidt, Robert (1934): “Statistical analysis of one-dimensional distributions,”
Annals of Mathematical Statistics, 5, 30-72. BP
v. Schrutka, Lothar v. (1941): “Eine neue Einteilung der Permutationen,” Mathe-
matische Annalen, 118, 246-50. N [46-32]
Schultz, Frank G. (1945): “Recent developments in the statistical analysis of
ranked data adapted to educational research,” The Journal of Experi-
mental Education, 13, 149-52. GJO
Schützenberger, Marcel-Paul (1948a): “Valeurs caractéristiques du coefficient
de corrélation par rang de Kendall dans le cas général,” Comptes Rendus
(Paris), 226, 2122-23. JN [49-134]
—— (1948b): “Étude statistique d’un problème de sociométrie,” Gallica
Biologica Acta, 1, 9 pp. LNO [48-602]
—— (1948c): “An ABAC for the sample range,” Psychometrika, 13, 95-97.
DLOP
Schuyler, Garret L. (1948): “The ordering of n items assigned to k rank cate- 
gories by votes of m individuals,” Journal of the American Statistical Associ- 
ation, 43, 559-63. MN 
Scott, E. L. (1950): “Note on consistent estimates of the linear structural relation
between two variables,” Annals of Mathematical Statistics, 21, 284-88.
J [50-733]
—— (1951): “On the consistency of certain estimates of the linear structural
relation,” Annals of Mathematical Statistics, 22, 140-41. J
Seal, H. L. (1947): “A historical note on the use of χ² to test the adequacy of a
mortality table graduation,” Journal of the Institute of Actuaries Students’
Society, 6, 185-87. E
—— (1948): “A note on the χ² smooth test,” Biometrika, 35, 202. EN [50-42]
Sealy, E. H. (1943): “Specification of tolerances on components and final as-
sembly. Methods of tightening tolerances on final assembly,” (British)
Ministry of Supply Advisory Service on Quality Control. Technical Report
No. Q.C./R./3 (19 May 1943). CO
















Selberg, Henrik L. (1940a): “Über eine Ungleichung der mathematischen Sta-
tistik,” Skandinavisk Aktuarietidskrift, 23, 114-20. C [41-228]
—— (1940b): “Zwei Ungleichungen zur Ergänzung des Tchebycheffschen
Lemmas,” Skandinavisk Aktuarietidskrift, 23, 121-25. C [41-227]
—— (1940c): “Ueber eine Verschärfung der Tchebycheffschen Ungleichung,”
Archiv for Mathematik og Naturvidenskab, 43, 30-32. C [40-245]
—— (1942): “On an inequality in mathematical statistics,” Norsk Mate-
matisk Tidsskrift, 24, 1-12, (Norwegian). C [47-199]
El Shanawany, M. R. (1936): “An illustration of the accuracy of the χ² approxi-
mation,” Biometrika, 28, 179-87. ENP
Sherman, B. (1950): “A random variable related to the spacing of sample val-
ues,” Annals of Mathematical Statistics, 21, 339-61. EN [51-192]
Shewhart, W. A. (1931): Economic control of quality of manufactured product,
D. Van Nostrand Co., Inc., N. Y., pp. 328-38. O
—— (1941): Contribution of statistics to the science of engineering, Bell Tele-
phone System Monograph B-1319, New York. IO
Shohat, J. A. (with an editorial note by Karl Pearson) (1929): “Inequalities for
moments of frequency functions and for various statistical constants,”
Biometrika, 21, 361-75. X
Shohat, J. A., and Tamarkin, J. D. (1943): The problem of moments, American
Mathematical Society, New York, Chapter 3. C [44-5]


Shone, K. J. (1949): “Relations between the standard deviation and the dis-
tribution of range in non-normal populations,” Journal of the Royal Sta-
tistical Society (B), 11, 85-88. KLN [50-260]
Silberstein, L. (1945): “The probable number of aggregates in random distribu-
tions of points,” Philosophical Magazine, 36, 319-36. IN [46-310]
Sillitto, G. P. (1947): “The distribution of Kendall’s τ coefficient of rank cor-
relation in rankings containing ties,” Biometrika, 34, 36-40. JNP [47-476]
—— (1949): “Note on approximations to the power function of the ‘2×2
comparative trial’,” Biometrika, 36, 347-52. HP [50-447]
—— (1951): “Interrelations between certain linear systematic statistics of
samples from any continuous population,” Biometrika, 38, 377-82.
LP [52-64]
Silverstone, H. (1950): “A note on the cumulants of Kendall’s S-distribution,”
Biometrika, 37, 231-35. JNP [51-344]
Simpson, E. H. (1951): “The interpretation of interaction in contingency ta-
bles,” Journal of the Royal Statistical Society (B), 13, 238-41. H [53-486]
Simpson, Paul B. (1951): “Note on the estimation of bivariate distribution
function,” Annals of Mathematical Statistics, 22, 476-78. BE [52-142]
Singh, Baikunth Nath (1952): “Use of complex Markoff’s chain in testing ran-
domness,” Journal of the Indian Society of Agricultural Statistics, 4, 145-48.
IN
Skory, John (1952): “Automatic machine method of calculating contingency χ²,”
Biometrics, 8, 380-82. H
Slutsky, Eugen (1925): “Über stochastische Asymptoten und Grenzwerte,”
Metron, 5, 3-90. C
Smirnov, N. V. (1935): “Über die Verteilung des allgemeinen Gliedes in der
Variationsreihe,” Metron, 12, 59-81. LN



















—— (1936): “Sur la distribution de ω² (Criterium de M. R. v. Mises),”
Comptes Rendus (Paris), 202, 449-52. EN
—— (1937): “Sur la distribution de ω²,” Matematičeskii Sbornik (Moscow)
(NS), 2, 973-93. EN
—— (1939a): “Sur les écarts de la courbe de distribution empirique,” Rec.
Math. N.S. (Matematičeskii Sbornik), 6, (48) 3-26 (Russian. French sum-
mary). FN [40-246]
—— (1939b): “On the estimation of the discrepancy between empirical curves
of distribution for two independent samples,” Bulletin of Mathematics,
University of Moscow, 2, 16 pp. FN [40-345]
—— (1947): “Sur un critère de symmétrie de la loi de distribution d’une vari-
able aléatoire,” Comptes Rendus (Doklady) de l’Académie des Sciences URSS
(N.S.), 56, 11-14. NPX [48-46]
—— (1948): “Table for estimating the goodness of fit of empirical distribu-
tions,” Annals of Mathematical Statistics, 19, 279-81. EFP [48-599]
—— (1949): Limit distributions for the terms of a variational series, American
Mathematical Society, Translation Number 67, Providence, 1952, 1-64.
LN [50-605]
Smith, B. Babington (1941): “Note on an alternant suggested by statistical
theory,” Edinburgh Mathematical Notes, 32, 19-22. J [43-106]
Smith, C. A. B. (1951): “A test for heterogeneity of proportions,” Annals of
Eugenics, 16, 16-25. H [52-260]
Smith, C. D. (1930): “On generalized Tchebycheff inequalities in mathematical
statistics,” American Journal of Mathematics, 52, 109-26. C
—— (1939): “On Tchebycheff approximation for decreasing functions,” An-
nals of Mathematical Statistics, 10, 190-92. C
—— (1951): “Some probability estimates for contingency tables,” Mathe-
matics Magazine, 25, 59-62. H
Snedecor, George W. (1953): “Queries,” Biometrics, 9, 107-10. G
Snedecor, George W., and Irwin, M. R. (1933): “On the chi-square test for
homogeneity,” Iowa State College Journal of Science, 8, 75-81. H
Spearman, C. (1904): “The proof and measurement of association between two
things,” American Journal of Psychology, 15, 72-101. J
Spurr, William A. (1951): “A short-cut measure of correlation,” Journal of the
American Statistical Association, 46, 89-94. J
Steffensen, J. F. (1941): “On the ω test of dependence between statistical vari-
ables,” Skandinavisk Aktuarietidskrift, 24, 13-33. J [42-5]
Steinhaus, H. (1948): “Elementary inequalities between the expected values of
current estimates of variance,” Colloquium Mathematicum, 1, 312-21.
K [49-724]
Stevens, W. L. (1937): “Significance of grouping and a test for uniovular twins
in mice,” Annals of Eugenics, 8, 57-73. INP
—— (1938): “Distribution of entries in a contingency table,” Annals of
Eugenics, 8, 238-44. HN
—— (1948): “Control by gauging,” Journal of the Royal Statistical Society
(B), 10, 54-108. G
—— (1951): “Mean and variance of an entry in a contingency table,” Bio-
metrika, 38, 468-70. HN [52-853]
Stewart, W. Mac (1941): “A note on the power of the sign test,” Annals of
Mathematical Statistics, 12, 236-39. GNP [42-8]















Stieltjes, T. J. (1884): “Sur l’évaluation approchée des intégrales,” Comptes
Rendus (Paris), 97, 740-42, 798-99. C
—— (1894-1895): “Recherches sur les fractions continues,” Annales de la
Faculté des Sciences de Toulouse Série 1, 8, T 1-122; Série 1, 9, A5-47. C
Stouffer, Samuel A., et al. (1950): Measurements and Prediction, Princeton
University Press, Princeton, N. J. MO
Stuart, A. (1950): “The cumulants of the first n natural numbers,” Biometrika,
37, 446. N [51-344]
—— (1951): “An application of the distribution of the ranking concordance co-
efficient,” Biometrika, 38, 33-42. GMN
—— (1952): “The power of two difference-sign tests,” Journal of the American
Statistical Association, 47, 416-24. INP
“Student” (1921): “An experimental determination of the probable error of Dr.
Spearman’s correlation coefficients,” Biometrika, 12, 263-82. JN
Sukhatme, B. V. (1949a): “Random association of points on a lattice,” Journal
of the Indian Society of Agricultural Statistics, 2, 60-85. IN [50-674]
—— (1949b): “Variance of triplets,” Nature, 164, 841. IN
—— (1951): “On certain probability distributions arising from points on a
line,” Journal of the Royal Statistical Society (B), 13, 219-32. NP [52-69]

Sukhatme, P. V. (1938): “On the distribution of χ² in samples of the Poisson
series,” Journal of the Royal Statistical Society (B), 5, 75-79. EP
Swaroop, Satya (1938): “Tables of the exact values of probabilities for testing the
significance of differences between proportions based on pairs of small
samples,” Sankhyā, 4, 73-84. HP
Swed, Frieda S., and Eisenhart, C. (1943): “Tables for testing randomness of
grouping in a sequence of alternatives,” Annals of Mathematical Statistics,
14, 66-87. P [43-223]
Swineford, Frances (1946): “Graphical and tabular aids for determining sample
size when planning experiments which involve comparisons of percentages,”
Psychometrika, 11, 43-49.
—— (1948): “A table for estimating the significance of the difference between
correlated percentages,” Psychometrika, 13, 23-25. HP [48-363]
T 

Tchebycheff, P. L. (1867): “Des valeurs moyennes,” Journal de Mathématiques
(Liouville, 2nd series), 12, 177-84. C
—— (1874): “Sur les valeurs limites des intégrales,” Journal de Mathématiques
pures et appliquées (2), 19, 157-60. C
—— (1890): “Sur deux théorèmes relatifs aux probabilités,” Acta Mathe-
matica, 14, 305-15. C
—— (1899): Oeuvres, St. Petersburgh, 2 vols.: pp. 687-694, vol. I; pp. 183-185,
vol. II. C
Terpstra, T. J. (1952a): “The asymptotic normality and consistency of Kendall’s
test against trend, when ties are present in one ranking,” Proceedings Ko-
ninklijke Nederlandse Akademie van Wetenschappen (A), 55 = Indagationes
Mathematicae, 14, 327-33. IN [68-6 4]
—— (1952b): “A confidence interval for the probability that a normally dis-
tributed variable exceeds a given value, based on the mean and the mean
range of a number of samples,” Applied Scientific Research, 3, Section A,
297-307. L
















Terry, M. E. (1952): “Some rank order tests which are most powerful against
specific parametric alternatives,” Annals of Mathematical Statistics, 23, 346-
66. GP [52-190]
Terry, M. E., Bradley, R. A., and Davis, L. L. (1952): “New designs and tech-
niques for organoleptic testing,” Food Technology, 6, 250-54. MO
Theil, H. (1949): “A note on the inequality of Camp and Meidell,” Statistica,
Rijswijk, 3, 201-8; (Dutch, English summary). C [51-34]
—— (1950a): “A rank-invariant method of linear and polynomial regression
analysis. I,” Proceedings Koninklijke Nederlandse Akademie van Wetenschap-
pen, 53, 1397-1412 = Indagationes Mathematicae, 12, 85-91. J [51-117]
—— (1950b): “A rank-invariant method of linear and polynomial regression
analysis. II,” Proceedings Koninklijke Nederlandse Akademie van Weten-
schappen, 53, 386-92 and 521-25 = Indagationes Mathematicae, 12, 173-77.
J [51-117]
—— (1951): “Distribution-free methods in the regression analysis of two
variables,” Statistica Rijswijk, 5, 97-117; (Dutch, English summary).
J [52-671]
Thomas, Harold A., Jr. (1948): “Frequency of minor floods,” Journal of the 
Boston Society of Civil Engineers, 35, 425-42. LOP 
Thomas, Marjorie (1951): “Some tests for randomness in plant populations,” 
Biometrika, 38, 102-11. IP 
Thompson, William R. (1936): “On confidence ranges for the median and other 
expectation distributions for populations of unknown distribution form,” 
Annals of Mathematical Statistics, 7, 122-28. GN 
—— (1938): “Biological applications of normal range and associated signifi-
cance tests in ignorance of original distribution forms,” Annals of Mathe-
matical Statistics, 9, 281-87. LN
Thornton, G. R. (1943): “The significance of rank difference coefficients of cor- 
relation,” Psychometrika, 8, 211-17. JP 
Tintner, Gerhard (1952): Econometrics, John Wiley and Sons, Inc., New York; 
sections 8.4, 9.4, 10.1.1. IO 
Tippett, L. H. C. (1927): Random sampling numbers, Tracts for Computers No. 
15, Cambridge University Press, London N.W.1. IP 
—— (1952): The method of statistics, John Wiley and Sons, Inc., New York;
4th edition, pp. 126-40, 273-75, 363-65. AEFGJ [53-296]
Tocher, K. D. (1950): “Extension of the Neyman-Pearson theory of tests to dis-
continuous variates,” Biometrika, 37, 130-44. H
Todd, H. (1940): “A note on random association in a square point lattice,” 
Journal of the Royal Statistical Society (B), 3, 78-82. I 
Treloar, Alan E. (1942): Correlation analysis, Burgess Publishing Co., Minneapo- 
lis, Minn. J [43-220] 


Tsao, Chia Kuei (1952): “An extension of Massey’s distribution of the maximum
deviation between two sample cumulative step functions,” Annals of Math-
ematical Statistics, 23, 638; abstract. F
Tukey, J. W. (1946): “An inequality for deviations from medians,” Annals of
Mathematical Statistics, 17, 75-78. X [46-462]
—— (1947): “Non-parametric estimation II. Statistical equivalent blocks and
tolerance regions—the continuous case,” Annals of Mathematical Statistics,
18, 529-39. D [48-296]















—— (1948a): “Nonparametric estimation, III. Statistically equivalent blocks
and multivariate tolerance regions—the discontinuous case,” Annals of
Mathematical Statistics, 19, 30-39. D [45-453]
—— (1948b): “Some elementary problems of importance to small sample prac-
tice,” Human Biology, 20, 205-14. K
—— (1948c): Comparing individual means in the analysis of variance, Mimeo-
graphed Report No. 6, Statistical Research Group, Princeton University;
appendix. LNP
—— (1949a): “Moments of random group size distributions,” Annals of
Mathematical Statistics, 20, 523-39. N [50-374]
—— (1949b): The simplest signed-rank tests, Mimeographed Report No. 17,
Statistical Research Group, Princeton University. GP
—— (1949c): A problem in the distribution of rankings, Mimeographed Report
No. 40, Statistical Research Group, Princeton University. GPN
—— (1950): “Estimation in the alternative family of distributions,” Pro-
ceedings of the International Congress of Mathematicians 1, p. 586; published
by the American Mathematical Society, 1952. GK
Tweedie, M. C. K. (1953): “Bias in estimation by interval,” Annals of Mathe-
matical Statistics, 24, 139; abstract. G

U
Uhler, Horace S. (1952): “Many-figure approximations for ∛2, ∛3, ∛4, and
∛9 with χ² data,” Scripta Mathematica, 18, 173-76. O [53-411]
Uspensky, J. V. (1937): Introduction to mathematical probability, McGraw-Hill
Book Co., New York; chapter 10 and pp. 373-80. C

V 


van der Vaart, R. H. (1950a): “Some remarks on the power function of Wilcoxon’s
test for the problem of two samples. I and II,” Proceedings Koninklijke
Nederlandse Akademie van Wetenschappen, 53, 494-506 and 507-20 = Inda-
gationes Mathematicae, 12, 146-58 and 159-72. F [51-38]
—— (1950b): Directions for the use of Wilcoxon’s test, Mathematisch Centrum
Amsterdam Rapport S 32 (M4), pp. 1-16; (Dutch). GP [52-143]
Ville, Jean (1943a): “Sur un critère d’indépendance,” Comptes Rendus (Paris),
216, 552-53. IN [44-206]
—— (1943b): “Sur l’application, à un critère d’indépendance, du dénombre-
ment des inversions présentées par une permutation,” Comptes Rendus
(Paris) 217, 41-42. I [45-8]
Vora, Shanti A. (1951): “Bounds on the distribution of chi-square,” Sankhyā,
11, 365-78. EN [52-189]


Votaw, David F., Jr. (1946): “The probability distribution of the measure of a 
random linear set,” Annals of Mathematical Statistics, 17, 240-44. 
N [47-281] 


W


van der Waerden, B. L. (1952): “Order tests for the two-sample problem and 
their power,” Proceedings Koninklijke Nederlandse Akademie van Weten- 
schappen, 55, 453-58. GNP 
(1953): “Order tests for the two-sample problem and their power (Corrigenda),”
Proceedings Koninklijke Nederlandse Akademie van Wetenschappen,
56, 80. G
Wald, A. (1938): “Generalization of the inequality of Markoff,” Annals of 
Mathematical Statistics, 9, 244-55. C 
(1939): “Limits of a distribution function determined by absolute moments
and inequalities satisfied by absolute moments,” Transactions of the
American Mathematical Society, 46, 280-306. C [40-1 4]
(1940): “The fitting of straight lines if both variables are subject to 
error,” Annals of Mathematical Statistics, 11, 284-300. J [41-108] 
(1943): “An extension of Wilks’ method for setting tolerance limits,” 
Annals of Mathematical Statistics, 14, 45-55. DP [43-222] 
(1947): “Limit distribution of the maximum and minimum of successive 
cumulative sums of random variables,” Bulletin of the American Mathe- 
matical Society, 53, 142-53. N [47-471] 
Wald, A., and Wolfowitz, J. (1939): “Confidence limits for continuous distribu- 
tion functions,” Annals of Mathematical Statistics, 10, 105-18. E 
(1940): “On a test whether two samples are from the same population,” 
Annals of Mathematical Statistics, 11, 147-62. FN [40-348] 
(1941): “Note on confidence limits for continuous distribution functions,”
Annals of Mathematical Statistics, 12, 118-19. EF [42-9]
(1943): “An exact test for randomness in the non-parametric case
based on serial correlation,” Annals of Mathematical Statistics, 14, 378-88.
IN [44-211]

(1944): “Statistical tests based on permutations of the observations,” 
Annals of Mathematical Statistics, 15, 358-72. BN [45-163] 
Waldapfel, L. (1943): “Über das Profil der Permutationen,” Matematikai és Fizikai
Lapok, 50, 257-61; (Hungarian, German summary). IN [47-190]
Walker, A. M. (1950): “Note on a generalization of the large sample goodness of
fit for linear autoregressive schemes,” Journal of the Royal Statistical
Society (B), 12, 102-07. EI [51-51 2]
(1952): “Some properties of the asymptotic power functions of goodness-of-fit
tests for linear autoregressive schemes,” Journal of the Royal Statistical
Society (B), 14, 117-34. EI
Walker, Helen M. (1929): “Certain mathematical questions suggested by the 
true-false test,” American Mathematical Monthly, 34, 503-15. IN 
Wallis, W. Allen (1939): “The correlation ratio for ranked data,” Journal of the 
American Statistical Association, 34, 533-38. GN 
(1942): “Compounding probabilities from independent significance
tests,” Econometrica, 10, 229-48. X [43-222]
(1952): “Rough-and-ready statistical tests,” Industrial Quality Control,
8, No. 5, 35-40. A
Wallis, W. Allen, and Moore, Geoffrey H. (1941a): A significance test for time 
series, Technical paper No. 1, National Bureau of Economic Research, New 
York. INP [42-176] 
(1941b): “A significance test for time series,” Journal of the American 
Statistical Association, 36, 401-09. INP [42-176] 
Walsh, John E. (1946a): “Some significance tests based on order statistics,” 
Annals of Mathematical Statistics, 17, 44-52. GLP [46-464] 





(1946b): “Some order statistic distributions for samples of size four,” 
Annals of Mathematical Statistics, 17, 246-48. L [47-43] 
(1946c): On the power function of a sign test formed by using subsamples, 
Undated Mimeographed report from Douglas Aircraft Co., Inc., Santa 
Monica, Calif. G 
(1946d): “On the power function of the sign test for slippage of means,” 
Annals of Mathematical Statistics, 17, 358-62. GP [47-42] 
(1947): Some significance tests for the mean using the sample range and 
midrange, Mimeographed report from Douglas Aircraft Co., Inc., Santa 
Monica, Calif. G 
(1949a): “Some significance tests for the median which are valid under 
very general conditions,” Annals of Mathematical Statistics, 20, 64-81. 
GNP [49-564] 
(1949b): “On the range-midrange test and some tests with bounded sig- 
nificance levels,” Annals of Mathematical Statistics, 20, 257-67. GP [50-191] 
(1949c): “Applications of some significance tests for the median which are
valid under very general conditions,” Journal of the American Statistical 
Association, 44, 342-55. GP 
(1949d): “Some comments on the efficiency of significance tests,” Human 
Biology, 21, 205-17. Mares 
(1950a): “On a generalization of the Behrens-Fisher problem,” Human
Biology, 22, 125-35. G
(1950b): “Some estimates and tests based on the r smallest values in a
sample,” Annals of Mathematical Statistics, 21, 386-97. GP [51-428]
(1950c): “Some nonparametric tests of whether the largest observations
of a set are too large or too small,” Annals of Mathematical Statistics, 21,
583-92. GP [51-842]
(1951a): “A large sample t-statistic which is insensitive to non-randomness,”
Journal of the American Statistical Association, 46, 79-88. G
(1951b): “Some bounded significance level properties of the equal-tail 
sign test,” Annals of Mathematical Statistics, 22, 408-17. BGP [53-298] 
(1952a): “Large-sample confidence intervals for density function values 
at percentage points,” Annals of Mathematical Statistics, 23, 302; abstract. 
G 
(1952b): “Some nonparametric tests for Student’s hypothesis in experi- 
mental designs,” Journal of the American Statistical Association, 47, 401-15. 
G [53-488] 
(1953): “Correction to ‘some nonparametric tests of whether the largest 
observations of a set are too large or too small’,” Annals of Mathematical
Statistics, 24, 134-35. G
Walter, Edward (1951): “Über einige nichtparametrische Testverfahren,” Mitteilungsblatt
für Mathematische Statistik, 3, 31-44, 73-92. A [52-368]
Waschakidse, D. (1938): “Über das Maximum der Abweichung des theoretischen
Verteilungsgesetzes von der entsprechenden empirischen Kurve,” Trav.
Inst. Math. Tbilissi, 4, 101-20, with German summary, 120-22; (Russian).
EN
Watson, Geof. (1952): “Extreme value theory for m-dependent stationary se- 
quence of continuous random variables,” Annals of Mathematical Statistics, 
23, 644-45; abstract. L 





Weichelt, John A. (1946): “A first-order method for estimating correlation
coefficients,” Psychometrika, 11, 215-21. JL [47-28 2]
Welch, B. L. (1937): “On the z-test in randomized blocks and Latin squares,” 
Biometrika, 29, 21-52. GNP 
(1938): “On tests for homogeneity,” Biometrika, 30, 149-58. GNP 
Westenberg, J. (1948): “Significance test for median and interquartile range in 
samples from continuous populations of any form,” Nederlandse Akademie 
Wetenschappen, Proceedings, 51, 252-61. GP [49-62 
(1950a): “A tabulation of the median test for unequal samples,” Nederlandse
Akademie Wetenschappen, Proceedings, 53, 77-82. GP [50-732]
(1950b): “The median and interquartile range test applied to frequency
distributions plotted on a circular axis,” Proceedings Koninklijke Nederlandse
Akademie van Wetenschappen, 53, 1034-37 = Indagationes Mathematicae,
12, 378-81. G
(1952): “A tabulation of the median test with comments and corrections
to previous papers,” Proceedings Koninklijke Akademie van Wetenschappen
Ser. A., 55 = Indagationes Mathematicae, 14, 10-15. G [52-664]
White, Colin (1952): “The use of ranks in a test of significance for comparing 
two treatments,” Biometrics, 8, 33-41. GNOP 
Whitfield, J. W. (1947): “Rank correlation between two variables, one of which 
is ranked, the other dichotomous,” Biometrika, 34, 292-96. J [48-453] 
(1949): “Intra-class rank correlation,” Biometrika, 36, 463-67. JP 
(1950): “Uses of the ranking method in psychology,” Journal of the Royal 
Statistical Society (B), 12, 163-70. JO 
Whitney, D. R. (1948): A comparison of the power of non-parametric tests and 
tests based on the normal distribution under non-normal alternatives, Disserta- 
tion, The Ohio State University. G 
(1951): “A bivariate extension of the U statistics,” Annals of Mathematical
Statistics, 22, 274-82. GP [51-840]
Whitworth, W. A. (1945): DCC exercises including hints for the solutions of all
the questions in CHOICE and CHANCE, G. E. Stechert, New York. N
van Wijngaarden, A. (1950): “Table of the cumulative symmetric binomial distribution,”
Nederlandse Akademie Wetenschappen, Proceedings, 53, 857-68.
P [51-55]

Wilcoxon, Frank (1945): “Individual comparisons by ranking methods,” Bio- 
metrics, 1, 80-83. GP 
(1946): “Individual comparisons of grouped data by ranking methods,” 
Journal of Economic Entomology, 39, 269-70. GOP
(1947): “Probability tables for individual comparisons by ranking methods,”
Biometrics, 3, 119-22. GP [48-603]
(1949): Some rapid approximate statistical procedures, American Cyanamid
Co., Stamford Research Laboratories (July 1949), 16 pp. GP
(1950): “Some rapid approximate statistical procedures,” Annals of the
New York Academy of Sciences, 52, 808-14. G
Wilkins, J. Ernest, Jr. (1944): “A note on skewness and kurtosis,” Annals of
Mathematical Statistics, 15, 333-35. X [46-91] 
Wilks, S. S. (1935): “The likelihood test of independence in contingency tables,” 
Annals of Mathematical Statistics, 6, 190-96. HN 
(1940): “Confidence limits and critical differences between percentages,”
Public Opinion Quarterly, 4, 332-38.







(1941): “Determination of sample sizes for setting tolerance limits,”
Annals of Mathematical Statistics, 12, 91-96. D [42-9]
(1942): “Statistical prediction with special reference to the problem of
tolerance limits,” Annals of Mathematical Statistics, 13, 400-09. DP [43-166]
(1943): Mathematical statistics, Princeton University Press, Princeton, 
N. J., pp. 30-31, 200-25. AN [44-41] 
(1948): “Order statistics,” Bulletin of the American Mathematical Soci- 
ety, 54, 6-50. AB [48-601] 
Williams, C. Arthur, Jr. (1950): “On the choice of the number and width of
classes for the chi-square test of goodness of fit,” Journal of the American 
Statistical Association, 45, 77-86. EP 
Wilson, Edwin B. (1941): “The controlled experiment and the four-fold table,” 
(1942): “On contingency tables,” Proceedings of the National Academy 
of Sciences (USA), 28, 94-100. H [43-26] 
(1952): An introduction to scientific research, McGraw-Hill Book Co., 
New York; pp. 185-88, 197-202, 229-31, 247-50, 266-68. CDEGIP 
Wilson, Edwin B., and Worcester, Jane (1942a): “Contingency tables,” Pro- 
ceedings of the National Academy of Sciences (USA), 28, 378-84. H [43-106] 
(1942b): “The association of three attributes,” Proceedings of the Na- 
tional Academy of Sciences (USA), 28, 384-90. H [43-106] 
Winsten, C. B. (1946): “Inequalities in terms of mean range,” Biometrika, 33, 
283-95. CPX [47-43] 
Wishart, J., and Hirschfeld, H. O. (1936): “A theorem concerning the distribu- 
tion of joins between line segments,” The Journal of the London Mathematical
Society, 11, 227-35. N
Wold, Herman (1938): A study in the analysis of stationary time series, Almqvist 
and Wiksells Boktryckeri-A.-B., Uppsala; see Appendix A. E 
(1948): Random normal deviates. 25,000 items compiled from tract no. 
XXIV (M. G. Kendall and B. Babington Smith’s tables of random sampling 
numbers), Cambridge University Press, London N.W. 1. I [49-563]
Wolfowitz, J. (1942): “Additive partition functions and a class of statistical 
hypotheses,” Annals of Mathematical Statistics, 13, 247-79. 
BEFIN [43-107] 
(1943): “On the theory of runs with some applications to quality con- 
trol,” Annals of Mathematical Statistics, 14, 280-88. AJ [44-40] 
(1944a): “Note on runs of consecutive elements,” Annals of Mathe- 
matical Statistics, 15, 97-98. N [45-5] 
(1944b): “Asymptotic distribution of runs up and down,” Annals of 
Mathematical Statistics, 15, 163-72. N [45-8] 
(1949): “Non-parametric statistical inference,” Proceedings of the Berke- 
ley Symposium on Mathematical Statistics and Probability, University of 
California Press, Berkeley, 93-113. AF [49-387] 
(1952): “Abraham Wald, 1902-1950,” Annals of Mathematical Statistics, 
23, 1-13; see page 11. A 
Woodbury, Max A. (1940): “Rank correlation when there are equal variates,” 
Annals of Mathematical Statistics, 11, 358-62. JNP [41-110] 
Woodruff, Ralph S. (1952): “Confidence intervals for medians and other position 
measures,” Journal of the American Statistical Association, 47, 635-46. 
GL [53-391]





Y 


Yamanouchi, Ziro (1949): “Estimates of mean and standard deviation of a 
normal distribution from linear combinations of some chosen order statis- 
tics,” Bulletin of Mathematical Statistics (edited by Research Association of 
Statistical Sciences), 3, 52-57; (Japanese, English abstract). L 

Yates, F. (1934): “Contingency tables involving small numbers and the χ²
test,” Journal of the Royal Statistical Society (B), 1, 217-35. H 

(1948): “The analysis of contingency tables with groupings based on 
quantitative characters,” Biometrika, 35, 176-81. H 
(1951): “Bases logiques de la planification des expériences,” Annales de 
l’Institut Henri Poincaré, 12, 97-112. G [52-669]

Youden, W. J. (1950): “Index for rating diagnostic tests,” Cancer, 3, 32-35. H 

Young, L. C. (1941): “On randomness in ordered sequences,” Annals of Mathe- 
matical Statistics, 12, 293-300. INP [43-17 4] 

Yule, G. Udny (1912): “On the methods of measuring association between two 
attributes,” Journal of the Royal Statistical Society, 75, 579-652. H 

(1922): “On the application of the χ² method to association and contingency
tables, with experimental illustrations,” Journal of the Royal
Statistical Society, 85, 95-104. EHO 

(1938): “A test of Tippett’s random sampling numbers,” Journal of the 
Royal Statistical Society, 101, 167-72. IO 

Yule, G. Udny, and Kendall, M. G. (1950): An introduction to the theory of 
statistics, Hafner Publishing Company, New York, 14th edition; pp. 19-68, 
258-72, 268-69, 459-81. EHIJ 


Z 


Zelen, Marvin (1951): “Bounds on a distribution which are a function of mo- 
ments,” Annals of Mathematical Statistics, 22, 315; abstract. C 
Zubin, Joseph (1939): “Nomographs for determining the significance of the 
differences between the frequencies of events in two contrasted series or 
groups,” Journal of the American Statistical Association, 34, 539-44. GP 


PAPERS BY ANNOTATION LETTER 


One of the main uses of this bibliography is to aid in preparing spe- 
cialized bibliographies for particular problems. To facilitate this, all 
articles are listed below by classification letter. 


A. Surveys and Discussions (39) 


Anscombe (1948); Birnbaum (1953); Bradley (1953); Bradley and Duncan 
(1950); Bradley and Terry (1951a); Camp (1942); Cantelli (1928); Cochran 
(1952); C. C. Craig (1942); Dixon and Massey (1951); Godwin (1944); Haldane 
(1949); Hemelrijk (1952a); Hemelrijk et al. (1951); Hoeffding (1948a, 1952a); 
Janko (1950); Kendall (1948a); Kolmogorov and Hinčin (1951); Lehmann
(1951); Moran (1950a); Moses (1952a); Mosteller (1941, 1953); Narumi (1923);
E. S. Pearson (1937, 1938b, 1947); Pitman (1948); Rényi (1951); Scheffé (1943b);
Tippett (1952); Wallis (1952); Walter (1951); Wilks (1943, 1948); Wolfowitz 
(1943, 1949, 1952). 







B. Theory (31) 


Berger (1951); Birnbaum (1953); Camp (1946); Feller (1938); Fraser (1953b);
Halmos (1946); Hodges and Lehmann (1950); Hoeffding (1948a, 1951b, 1952a,
1952b); Lehmann (1951, 1953); Lehmann and Stein (1949); Levene (1952);
Madow (1948); Neyman (1942, 1949); Noether (1949, 1950); E. S. Pearson 
(1938a); Pitman (1948); Scheffé (1943a); Scheffé and Tukey (1945); Schmidt 
(1934); P. B. Simpson (1951); Wald and Wolfowitz (1944); Walsh (1949d, 
1951b); Wilks (1948); Wolfowitz (1942). 


C. Tchebycheff Inequalities (94) 


O. N. Anderson (1935); Barnard (1943b, 1944); Berge (1932, 1937); Bernstein 
(1924, 1927, 1937); Bienaymé (1853, 1867); Bignardi (1947a, 1947b); Birnbaum 
(1948); Birnbaum and Raymond and Zuckerman (1947); Camp (1922, 1923, 
1948); Cantelli (1910, 1911, 1928); Chapelon (1937); C. C. Craig (1933); Cra- 
mér (1946); Curtiss (1950); van Dantzig (1951b); Doob (1951, 1953); Faber 
(1922); Fréchet (1931, 1950); Frisch (1926); Gauss (1821); Godwin (1944); 
Guldberg (1922); Guttman (1948a, 1948b); Halmos (1950); Herdan (1949, 1950, 
1953); de Jongh (1941); Khamis (1950, 1952); Kolmogorov (1928, 1933a, 1950);
C. E. V. Lesser (1942); Lévy (1936); Loève (1945); Lurquin (1922); Markoff
(1884a, 1884b, 1924); Masuyama (1942); Medolaghi (1909); Meidell (1921,
1922, 1923); Midzuno (1950); von Mises (1931a, 1938, 1939b); Münzner (1950);
Narumi (1923); Offord (1945); K. Pearson (1919); Peeks (1933); Perks (1947);
Possé (1886); Robbins (1948); Romanovski (1940); Royden (1952); Sealy (1943);
Selberg (1940a, 1940b, 1940c, 1942); Shohat and Tamarkin (1943); Slutsky
(1925); C. D. Smith (1930, 1939); Stieltjes (1884, 1894); Tchebycheff (1867,
1874, 1890, 1899); Theil (1949); Uspensky (1937); Wald (1938, 1939); Wilson
(1952); Winsten (1946); Zelen (1951).


D. Tolerance Sets (21) 


Birnbaum and Zuckerman (1949); Fraser (1951, 1953a); Fraser and Worm- 
leighton (1951); Goodman (1953); Gumbel and Schelling (1950); Hemelrijk 
(1949b); Hemelrijk, et al. (1951); Murphy (1948); Noether (1951); Paulson 
(1943); Robbins (1944a); Scheffé and Tukey (1944, 1945); Schützenberger
(1948c); Tukey (1947, 1948a); Wald (1943); Wilks (1941, 1942); Wilson (1952). 


E. Goodness of Fit (122) 


O. N. Anderson (1935); T. W. Anderson and Darling (1952); Azorín and Wold
(1950); Berkson (1938); Berkson and Geary (1941); Birnbaum (1950, 1952, 
1953); Birnbaum and Tingey (1951); Borsting (1952); Brownlee (1924); Camp 
(1938); Carroll and Bennett (1950); Chung (1949); Cochran (1937a, 1942, 1952);
Cramér (1928b); F. N. David (1934, 1939, 1947a, 1948, 1950a, 1950b); F. N. 
David and Johnson (1948); Donsker (1952); Doob (1949); DuBois (1935); 
Edwards (1950); Eisenhart (1935, 1937, 1938); Eisenhart and Wilson (1943); 
Elderton (1901, 1927); Feller (1948); Feuell and Rybicka (1951); Fisher (1923, 
1924, 1928, 1950); Fiske and Dunlap (1945); Fraser (1950); Fry (1938); Geary 
(1947); Glivenko (1933); Grüneberg and Haldane (1937); Gumbel (1942, 1948);
Hald and Sinkbaek (1950); Haldane (1937); Hannan (1950); Hemelrijk (1950a, 
1950b, 1950c, 1950d); Hoel (1938); Kac (1949); Kimball (1947); Kolmogorov 





(1933b, 1941); Lancaster (1950); P. C. V. Lesser (1933); Lewis and Burke (1949, 
1950); Malmquist (1950); Maniya (1949); Mann and Wald (1942); Massey
(1950a, 1950b, 1951a, 1952c); von Mises (1931b, 1947); Moroney (1951); K. R. 
Nair (1937); Neyman (1935, 1937, 1940, 1949); Neyman and Pearson (1928, 
1930); Okamoto (1952); Olmstead (1940); Pastore (1950); Patnaik (1949); E. S.
Pearson (1938a); K. Pearson (1900, 1922, 1924, 1932a, 1933a, 1933b, 1934); 
Peters (1950); Poti (1950); Robinson (1933); Rosenblatt (1952a, 1952b); Sawkins
(1941); Seal (1947, 1948); Shanawany (1936); Sherman (1950); P. B. Simpson 
(1951); Smirnov (1936, 1937, 1948); P. V. Sukhatme (1938); Tippett (1952); 
Wald and Wolfowitz (1939, 1941); Vora (1951); A. M. Walker (1950, 1952); 
Waschakidse (1938); Williams (1950); Wilson (1952); Wold (1938); Wolfowitz 
(1942); Yule (1922); Yule and Kendall (1950). 


F. Multisample Problems (54) 


T. W. Anderson and Darling (1952); Bhattacharyya (1943, 1946); Bowker 
(1944); van Dantzig (1951a); Dixon (1940); Doob (1949); Drion (1952); Feller
(1948); R. S. Gardner (1950); Gihman (1952); Gnedenko (1952); Gnedenko
and Korolyuk (1951); Gnedenko and Mihalevič (1952a, 1952b); Gnedenko and
Rvačeva (1952); Hemelrijk (1950a, 1950b, 1950c, 1950d, 1951); Hemelrijk, et
al. (1951); Kolmogorov (1941); Krishna Iyer (1951a, 1952b); Kunisawa, Makabe
and Morimura (1951); Kvit (1950); Lal (1952); Lehmann (1951); Marshall
(1951); Massey (1951a, 1951b, 1951c, 1952b); Mathen (1946); Mathisen (1943);
Mihalevič (1952); Neyman and Pearson (1928, 1930); Nybolle (1936); K. Pearson
(1911, 1932b); Petrov (1951); Rhodes (1924); Smirnov (1939a, 1939b, 1948);
Tippett (1952); Tsao (1952); van der Vaart (1950a); Wald and Wolfowitz 
(1940, 1941); Wolfowitz (1942, 1949). 


G. Parameter Problems (136)


Armitage (1944); Barnard (1943a); Blomqvist (1951); Bradley (1952b); 
Bradley and Terry (1951b, 1952a, 1952b, 1952c); G. W. Brown and Mood
(1951); Cochran (1937b, 1943, 1950); L. C. Cole (1945); F. N. David and Johnson
(1951a, 1951b, 1951c, 1952); H. A. David (1951); Dixon (1952); Dixon and
Mood (1946); Drion (1951); Durbin (1951); Dwass (1952); Eddison, et al. (1951);
Ehrenberg (1951); Eisenhart (1947); Epstein (1953); Evans (1942); Festinger 
(1943, 1946); Fisher (1948, 1949); Fix and Hodges (1951, 1952); Fraser (1952); 
Freeman and Halton (1951); Friedman (1937); Gayen (1950a, 1950b); Guttman 
(1948b); Halmos (1946); Hartley (1950a, 1950b); Hayashi (1950); Hemelrijk 
(1950b, 1950c, 1950d, 1952b, 1952c); Hemelrijk, et al. (1951); Hodges and Leh- 
mann (1950); Jeeves and Richards (1950); Kemperman (1950); Kempthorne 
(1952); Kruskal (1952); Kruskal and Wallis (1952); Lehmann (1953); Lloyd 
(1952); Mann and Whitney (1947); Marshall (1951); Marshall and Walsh (1950);
Massey (1952a, 1952d); Mathematical Centre (1952); Mood (1950); Moroney 
(1951); Moses (1952b); Mosteller (1948); Mosteller and Tukey (1950); A. N. K. 
Nair (1941); K. R. Nair (1940a, 1940b, 1948a); Noether (1948, 1951); E. S. 
Pearson (1931, 1950b); Pillai (1951); Pitman (1937a, 1937c, 1939); Rijkoort 
(1952); Savur (1937, 1938); Schultz (1945); Snedecor (1953); Stevens (1948, 
1951); Stewart (1941); Stuart (1951); Swineford (1946); Terry (1952); Thomp- 
son (1936); Tippett (1952); Tukey (1949b, 1949c, 1950); Tweedie (1953); van 
der Vaart (1950b); van der Waerden (1952, 1953); Wallis (1939); Walsh (1946a, 





1946c, 1946d, 1947, 1949a, 1949b, 1949c, 1950a, 1950b, 1950c, 1951a, 1951b, 
1952a, 1952b, 1953); Welch (1937, 1938); Westenberg (1948, 1950a, 1950b, 1952);
White (1952); Whitney (1948, 1951); Wilcoxon (1945, 1946, 1947, 1949, 1950); 
Wilks (1940); Wilson (1952); Woodruff (1952); Yates (1951); Zubin (1939). 


H. Contingency Tables (76) 


Adler (1951); Allinson and Bates (1944); Barnard (1945, 1947a, 1947b); 
Bonnier (1942); Carroll and Bennett (1950); Cochran (1952); C. C. Craig (1953);
Crow (1952); Edwards (1950); Eisenhart (1935); Federighi (1950); Finney 
(1948); Fisher (1922, 1926a, 1935, 1941, 1945); Freeman and Halton (1951); 
Fulcher (1942); Gebelein (1942); Geppert (1944); Gildemeister and Waerden 
(1943); Goodman and Kruskal (1953); Haldane (1939, 1940); Irwin (1935, 
1949); Irwin and Snedecor (1933); Jeffreys (1937); Jurgensen (1947); Kendall 
(1948b); Kermack and McKendrick (1940); Kondo (1939); Lancaster (1949, 
1951); Leslie (1951); Lewis and Burke (1949); Lombard and Doering (1947); 
Mainland (1948, 1952); Mainland and Murray (1952); Mood (1949); Moroney 
(1951); Neyman and Pearson (1928); Patnaik (1948); E. S. Pearson (1947); 
E. S. Pearson and Merrington (1948); K. Pearson (1916a); K. Pearson and
Heron (1913); Pompilj (1952); Rao (1952); Sillitto (1949); B. H. Simpson 
(1951); Skory (1952); C. A. B. Smith (1951); C. D. Smith (1951); Snedecor and 
Irwin (1933); Stevens (1938, 1951); Swaroop (1938); Swineford (1948); Tocher 
(1950); Wilks (1935); Wilson (1941, 1942); Wilson and Worcester (1942a, 1942b);
Yates (1934, 1948); Youden (1950); Yule (1912, 1922); Yule and Kendall 
(1950). 


I. Randomness (109)


Armitage, Baines and Lindley (1944); Bartlett (1951, 1952); Bateman (1948); 
B. M. Bennett (1952); C. A. Bennett (1951); J. Bertrand (1875); Besson (1920); 
Bienaymé (1875); Bilham (1926); Borel (1933); Bose (1946, 1950); B. Brown 
(1948); Cameron (1952); Campbell (1942); Cochran (1936, 1938); Cowles and
Jones (1937); Dantzig (1939); F. N. David (1947b); H. T. Davis (1941); Dodd 
(1939, 1941, 1942); Eisenhart and Wilson (1943); Elfving and Whitlock (1950); 
Finney (1942, 1947); Fisher (1926b); Freund (1951); Gage (1943); Ghosh (1948); 
Gleissberg (1945a, 1945b); Gold (1929); Good (1953); Goodman (1952); Grant 
(1952); J. A. Greenwood (1946); Hald (1952); Haldane and Smith (1948); 
Housner and Brennan (1948); N. L. Johnson (1948); H. E. Jones (1937); 
Juncosa (1949); Keeping (1952); Kendall and Smith (1938, 1939a, 1939b);
Kermack and McKendrick (1937a, 1937b); Krishna Iyer (1948b, 1952a); Kuz- 
nets (1929); Levene (1946a, 1946b, 1952); Levene and Wolfowitz (1944); Lowry 
(1951); Mahalanobis (1944); Mann (1945a, 1945b, 1950); Marbe (1926, 1934); 
Mihoc (1943); G. H. Moore and Wallis (1943); P. G. Moore (1949); Moran 
(1947a, 1948c, 1951a); Mosteller (1941); K. R. Nair (1938); Noether (1950);
Olds (1949); Olmstead (1942, 1946); van der Plank (1947); Rosander (1942); 
von Schelling (1939); Shewhart (1941); Silberstein (1945); Singh (1952); 
Stevens (1937); Stuart (1952); B. V. Sukhatme (1949a, 1949b); Terpstra 
(1952a, 1952b); M. Thomas (1951); Tintner (1952); Tippett (1927); Todd 
(1940); Ville (1943a, 1943b); Wald and Wolfowitz (1943); Waldapfel (1943); 
A. M. Walker (1950, 1952); H. M. Walker (1929); Wallis and Moore (1941a, 
1941b); Wilson (1952); Wold (1948); Wolfowitz (1942); Young (1941); Yule 
(1938); Yule and Kendall (1950). 





J. Correlation and Curve Fitting (96) 


Bartlett (1949); Blomqvist (1950); A. Brown and Stanley (n.d.); Chown and 
Moran (1951); Cochran (1937b); Cramér (1924); Crist (1940); Cronbach and 
Glosser (1952); Daniels (1943, 1950, 1951); Daniels and Kendall (1947); F. N.
David (1950c); S. T. David, Kendall and Stuart (1951); Deming (1938); Drion
(1951); DuBois (1939); Durbin and Stuart (1951); Eells (1929); Esscher (1924); 
Gayen (1951); Geary (1953); Guilford (1941); Hemelrijk (1949a, 1949b, 1950e); 
Hemelrijk, et al. (1951); Hoeffding (1940, 1941, 1942, 1947, 1948b, 1952b); 
Horn (1942); Hotelling (1940); Hotelling and Pabst (1936); Judd (1936); 
Kaarsemaker and Wijngaarden (1952); Kendall (1938, 1942b, 1945, 1947, 
1948a, 1948b, 1949); Kendall, Kendall and Smith (1938); Lehmann (1953); 
Lindeberg (1925, 1929); Lyerly (1952); McNemar (1947); Mood (1950); Moran 
(1948a, 1948b, 1950a, 1950b, 1951b); Moroney (1951); K. R. Nair and Banerjee 
(1942); K. R. Nair and Shrivastava (1942); Neyman (1951); Neyman and Scott 
(1951, 1952); Nimeroff (1952); Norton (1946); Olds (1938a); Olmstead and 
Tukey (1947); K. Pearson (1916b); Pitman (1937b); Pridmore (1944); Quenouille
(1952); Schultz (1945); Schützenberger (1948a); Scott (1950, 1951); Sillitto
(1947); Silverstone (1950); B. B. Smith (1941); Spearman (1904); Spurr (1951); 
Steffensen (1941); “Student” (1921); Theil (1950a, 1950b, 1951); Thornton 
(1943); Tippett (1952); Treloar (1942); Wald (1940); Weichelt (1946); Whit- 
field (1947, 1949, 1950); Wolfowitz (1943); Woodbury (1940); Yule and Kendall 
(1950). 


K. Comparative Studies (44) 


Baker (1946); Bartlett (1935); Benson (1949); Bradley (1952a, 1952b); Camp 
(1946); Chung (1946); Cochran (1936); F. N. David (1949); F. N. David and 
Johnson (1951a, 1951b, 1951c, 1952); Davies and Pearson (1934); Eden and
Yates (1933); Eisenhart (1947); Festinger (1943); Finch (1950); Gayen (1949, 
1950b); Geary (1936, 1947); Godwin (1949a); Hastings, et al. (1947); Hirschmann
(1943); Hotelling (1947); H. L. Jones (1953); Laderman (1939); Moriguti
(1951); A. N. K. Nair (1941); K. R. Nair (1950); E. S. Pearson (1931, 1937, 
1938b, 1950b); E. S. Pearson and Adyanthiaya (1929); E. S. Pearson and Mer- 
rington (1951); Perlo (1933); Plackett (1947); Rider (1929); Shone (1949); 
Steinhause (1948); Tukey (1948b, 1950). 


L. Systematic Statistics (127) 


Baker (1946); Banerjee (1952); Barnard (1943a); Belz and Hooke (1953); Benderskii
(1952); C. A. Bennett (1952, 1953); Benson (1949); Bhate (1951); Burr
(1952); Cadwell (1952); Chandler (1952); Chandra Sekar and Francis (1941); 
Chaplin (1880, 1882); Cochran (1941); R. H. Cole (1949, 1951); Cox (1948); 
A. T. Craig (1932); Cramér (1946); Daly (1946); Darling (1952a, 1952b); H. A. 
David (1951); Davies and Pearson (1934); Dodd (1923); DuBois (1935); Dwass 
(1952); Egudin (1947); Eisenhart, Deming and Martin (1948); Epstein (1948, 
1949, 1951, 1952); Epstein and Sobel (1952a, 1952b); Feller (1951); Fisher and 
Tippett (1928); Fisher and Yates (1948); Fréchet (1927); Gartstein (1948);
Gnedenko (1943); Godwin (1948, 1949b); Griffith (1920); Gumbel (1935, 1944, 
1946, 1947, 1949); Gumbel and Keeney (1950a, 1950b); Gumbel and Greenwood
(1951); Hald (1952); Harris (1952); Hartley (1942, 1950a, 1950b); Hartley and 





Pearson (1951); Hastings, et al. (1947); Hayashi (1950); Hoeffding (1953); Hojo
(1932, 1933); Homma (1951); Irick (1952); A. E. Jones (1946); H. L. Jones 
(1948); Kawata (1951); Keen, Page and Hartley (1953); Kendall (1940); Lord 
(1947); May (1952); McIntyre (1952); McKay (1935); McMillan (1949); 
Melzler (1949, 1950); von Mises (1936); Mood (1941); Moriguti (1951, 1952b); 
Moshman (1952); Mosteller (1946); K. R. Nair (1948a, 1948b, 1952); K. R. 
Nair and Shrivastava (1942); Ogawa (1951); Olds (1935); Patnaik (1950); 
E. S. Pearson (1932, 1937, 1938b, 1952); E. S. Pearson and Haines (1935);
E. S. Pearson and Hartley (1943); K. Pearson (1907, 1920); Peirce (1926); 
Pillai (1948, 1950, 1951, 1952); Pillai and Ramachandran (1953); Rider (1950, 
1951a, 1951b, 1953); Sandelius (1952); Schützenberger (1948b, 1948c); Shone
(1949); Sillitto (1951); Smirnov (1935, 1949); Terpstra (1952b); H. A. Thomas 
(1948); Thompson (1938); Tukey (1948c); Walsh (1946a, 1946b); Watson 
(1952); Weichelt (1946); Woodruff (1952); Yamanouchi (1949). 


M. Scaling (37) 


Baten (1946); Baten and Trout (1946); Bliss, Anderson and Marland (1943);
Bradley (1953); Bradley and Duncan (1950); Bradley and Terry (1951a, 1951b); 
Chapman (1934, 1935); Cramér (1946); Crist and Seaton (1941); Durbin (1951); 
Eddison, et al. (1951); Ehrenberg (1952); Eysenck (1939); Friedman (1940); 
J. A. Greenwood (1943); M. L. Greenwood and Salerno (1949); Greville (1941);
Grossnickle (1942); Guttman (1946); van der Heiden (1952); P. O. Johnson 
(1950); Judd (1936); Kendall (1942a, 1948a); Kendall and Smith (1939c, 1939d);
Moran (1947b); Mosteller (1951a, 1951b, 1951c); Scheffé (1952); Schuyler
(1948); Stouffer, et al. (1950); Stuart (1951); Terry, Bradley and Davis (1952). 


N. Distribution Theory (383) 


Andersen (1949); T. W. Anderson (1943); T. W. Anderson and Darling (1952);
André (1879, 1883a, 1883b); Arfwedson (1951); Armitage, Baines and Lindley 
(1944); Bachelier (1912); Banerjee (1952); Bateman (1948); Baticle (1951); Bat- 
tin (1942); Benderskii (1952); Belz and Hooke (1953); Bendersky (1948); 
Bergström (1949, 1951); Berry (1941); Bertrand (1875, 1907); Besson (1920);
Bickerstaff (1947); Bienaymé (1875); Bilham (1926); Birnbaum (1948, 1950); 
Birnbaum and Tingey (1951); Blomqvist (1950, 1951); Borel (1933); Bose 
(1946, 1950); Bottema and van Veen (1943, 1946); G. W. Brown and Mood 
(1951); Burr (1952); Cadwell (1952); Carlton (1946); Chandler (1952); Chandra 
Sekar and Francis (1941); Chapman (1934, 1935); Chung (1948, 1949); Chung 
and Feller (1949); Clark (1933); Cochran (1938, 1942, 1950); Cramér (1928a, 
1928b, 1946); Daniels (1951); Daniels and Kendall (1947); Dantzig (1939); 
Darling (1951, 1952a, 1952b); F. N. David (1934, 1938, 1939, 1947a, 1947b, 
1948); F. N. David and Johnson (1948, 1951a, 1951b, 1952); S. T. David,
Kendall and Stuart (1951); Dixon (1940); Dodd (1923); Domb (1947, 1952); 
Donsker (1952); Doob (1949, 1953); Drion (1952); Durbin and Stuart (1951);
Eggleton and Kermack (1944); Ehrenberg (1952); Eisenhart (1937, 1938); 
Eisenhart, Deming and Martin (1948); Epstein (1949); Erdés and Kac (1947) ; 
Esscher (1924); Essen (1942); Evans (1942); Federighi (1950); Feller (1945, 
1948, 1950, 1951); Finney (1947); Fisher (1922, 1923, 1925, 1926a, 1926b, 1928) ; 
Fisher and Tippett (1928); Fisher and Yates (1948); Fraser (1950); Fréchet 
(1927); Freund (1951); Friedman (1937, 1940); A. Gardner (1952); R. S. Gardner 







(1950); Gartstein (1948); Gayen (1951); Geary (1936); Ghosh (1948); Gildemeister
and Waerden (1943); Gleissberg (1945a, 1947); Gnedenko (1943, 1952);
Gnedenko and Korolyuk (1951); Gnedenko and Mihalevit (1952a, 1952b); 
Gnedenko and Rvacéva (1952); Godwin (1948); Gold (1929); Gontcharoff 
(1942, 1943, 1944); Good (1953); Goodman (1952); Goudsmit (1945); Grant 
(1952); J. A. Greenwood (1938, 1940, 1946); R. E. Greenwood (1953); Greville 
(1938, 1941, 1944); Gumbel (1935, 1944, 1946, 1947, 1949); Gumbel and Green- 
wood (1951); Gumbel and Keeney (1950a, 1950b); Gumbel and Schelling (1950); 
Gupta (1950); Haden (1947); Hadwiger (1943, 1946); Haldane (1937, 1939, 
1940); Haldane and Smith (1948); Hannan (1950); Harris (1952); Hartley (1942, 
1950a, 1950b); Hastings et al. (1947); Hayashi (1950); Hemelrijk (1952c); van 
der Heiden (1952); Hoeffding (1947, 1948a, 1948b, 1951a, 1953); Hoeffding and 
Robbins (1948); Hoel (1938); Hojo (1932); Hotelling and Pabst (1936); Hous- 
ner and Brennan (1948); Hsu (1945); Huntington (1937); Irick (1952); Ising 
(1925); N. L. Johnson (1948); Juncosa (1949); Kac (1949); Kanellos (1948); 
Kaplan (1949); Kaplansky (1945a); Kaplansky and Riordan (1945, 1946); Katz 
(1952); Kawata (1951); Keeping (1952); Kemperman (1950); Kendall (1938, 
1941, 1942a, 1942b, 1945, 1947, 1948a, 1949); Kendall, Kendall and Smith
(1938); Kendall and Smith (1939c, 1939d); Kermack and McKendrick (1937a,
1937b, 1938); Kimball (1947, 1950); Kolmogorov (1933b, 1941); Krishna Iyer 
(1947, 1948a, 1948b, 1949a, 1949b, 1950a, 1950b, 1950c, 1951b, 1952a, 1952b); 
Krishna Iyer and Sukhatme (1949); Kruskal (1952); Kullback (1939); Kuznets 
(1929); Levene (1946b, 1952); Levene and Wolfowitz (1944); Lévy (1936); 
Loéve (1945, 1950); Lyerly (1952); Mack (1948); MacMahon (1915, 1916); 
Madow (1948); Mahalanobis (1944); Malmqvist (1950); Maniya (1949); Mann 
(1945a, 1945b); Mann and Whitney (1947); Marbe (1926, 1934); Marshall 
(1951); Massey (1950b, 1951b, 1951c); Masuyama (1951); Mathisen (1943); 
Mauldon (1951); McCarthy (1947); McKay (1935); McMillan (1949); Melizler 
(1950); Mihalevit (1952); Mihoc (1943, 1949); von Mises (1936, 1947); Mood 
(1940, 1941, 1950); G. H. Moore and Wallis (1943); P. G. Moore (1949); Moran
(1947a, 1947b, 1947c, 1948a, 1948b, 1948c, 1950b, 1951a, 1951b); Moriguti
(1951); Mosteller (1941, 1946, 1948); Mosteller and Tukey (1949, 1950); Mul- 
lemeister (1945); A. N. K. Nair (1942); K. R. Nair (1937, 1940a, 1940b, 1948a, 
1948b); Neyman (1935, 1937); Noether (1949, 1950); Olds (1935, 1938b, 1949, 
1952); Olmstead (1946); Olmstead and Tukey (1947); Patnaik (1949); E. S.
Pearson (1952); E. S. Pearson and Hartley (1943); K. Pearson (1900, 1911, 
1920, 1933b); Pillai (1952); Pitman (1937c); van der Plank (1947); Pollaczek 
(1952); Rajalakshma (1943); Rider (1950, 1951a, 1951b); Robbins (1944b,
1945); Rosenblatt (1952b); Sakamoto (1943); Sandelius (1952); Santalé (1947); 
Sawkins (1947); von Schelling (1939); Schrutka (1941); Schiitzenberger (1948a, 
1948b); Schuyler (1948); Seal (1948); Shanawany (1936); Sherman (1950); 
Shone (1949); Silberstein (1945); Sillitto (1947); Silverstone (1950); Singh 
(1952); Smirnov (1935, 1936, 1937, 1939a, 1939b, 1947, 1949); Stevens (1937, 
1938, 1951); Stewart (1941); Stuart (1950, 1951, 1952); “Student” (1921); 
B. V. Sukhatme (1949a, 1949b, 1951); Terpstra (1952a, 1952b); Thompson 
(1936, 1938); Tukey (1948c, 1949a, 1949c); Ville (1943a); Vora (1951); Votaw 
(1946) ; van der Waerden (1952); Wald (1947); Wald and Wolfowitz (1940, 1943, 
1944); Waldapfel (1943); H. M. Walker (1929); Wallis (1939); Wallis and Moore 
(1941a, 1941b); Walsh (1949a); Waschakidse (1938); Welch (1937, 1938); White 







(1952); Whitworth (1945); Wilks (1935, 1943); Wishart and Hirschfeld (1936); 
Wolfowitz (1942, 1944a, 1944b); Woodbury (1940); Young (1941). 


O. Applications (89)


Azorin and Wold (1950); Baillie (1946); Baker and Guilbert (1942); Barnard 
(1943a, 1943b) ; Bates and Neyman (1951); C. A. Bennett (1951); Bilham (1926); 
B. Brown (1948); W. R. J. Brown (1952); Brownlee (1924); Cameron (1952) ; 
Campbell (1942); Chaplin (1880, 1882); Charley (1950, 1952); Clark (1933, 
1934); Cochran (1936, 1938); L. C. Cole (1945); Cowles and Jones (1937); 
Cramér (1928b); Crist (1940); Crist and Seaton (1941); van Dantzig (1952); 
D. G. Davis (1952); Dawson, Duehring and Parks (1947); Eisenhart (1935); 
Eisenhart and Wilson (1943); Elfving and Whitlock (1950); Epstein (1948); 
Esscher (1924); Eysenck (1939); Fiedler, Hartman and Rudin (1952); Gage 
(1943); Gold (1929); M. L. Greenwood and Salerno (1949); Griffith (1920); 
Grüneberg and Haldane (1937); Guttman (1946); Hald (1952); Haldane and
Smith (1948); Herdan (1949, 1950); H. E. Jones (1937); Judd (1936); Keeping 
(1952); Keen, Page and Hartley (1953); Kendall and Smith (1938, 1939b); 
Kermack and McKendrick (1937a); P. C. V. Lesser (1933); Lewis and Burke
(1949); Lollar (1952); Lombard and Doering (1947); Lowry (1951); Mainland 
(1948); Moses (1952a); Mosteller (1941); K. R. Nair (1938); Nimeroff (1952); 
Nybølle (1936); Olmstead (1940); K. Pearson (1900, 1911, 1932a); K. Pearson
and Heron (1913); Peirce (1926); Pridmore (1944); Reynolds (1952); Scheffé 
(1952); Schultz (1945); Schiitzenberger (1948b, 1948c); Sealy (1943); Shewhart 
(1931, 1941); Stouffer, et al. (1950); Terry, Bradley and Davis (1952); H. A. 
Thomas (1948); Tintner (1952); Uhler (1952); White (1952); Whitfield (1950); 
Wilcoxon (1946); Yule (1922, 1938). 


P. Tables (228) 


O. N. Anderson (1935); T. W. Anderson and Darling (1952); Azorin and Wold 
(1950); Baker (1946); Bateman (1948); C. A. Bennett (1952); Berge (1932, 
1937); Bickerstaff (1947); Birnbaum (1952); Birnbaum and Tingey (1951); 
Blomqvist (1950, 1951); Cadwell (1952); Camp (1922, 1948); Chandler (1952) ; 
Chapman (1935); Clopper and Pearson (1934); L. C. Cole (1945); R. H. Cole 
(1949) ; Cox (1948); Daly (1946); F. N. David (1934, 1939, 1947a, 1947b, 1949, 
1950a); H. A. David (1951); S. T. David, Kendall and Stuart (1951); Dixon 
(1940); Dixon and Mood (1946); Dodd (1923); Drion (1952); DuBois (1939); 
Duncan (1952); Elderton (1901); Epstein (1952); Epstein and Sobel (1952a); 
Eysenck (1939); Festinger (1946); Finch (1950); Finney (1948); Fisher (1950); 
Fisher and Tippett (1928); Fisher and Yates (1948); Fix (1949); Fix and
Hodges (1952); Friedman (1937, 1940); Fulcher (1942); A. Gardner (1952); 
Gayen (1949, 1950a, 1950b, 1951); Geary (1947); Gleissberg (1945a); God- 
win (1949b); Gordon et al. (1952); Grant (1952); R. E. Greenwood (1953); 
Guilford (1941); Gumbel (1935, 1947, 1949); Gumbel and Greenwood (1951); 
Gumbel and Keeney (1950a, 1950b); Gumbel and Schelling (1950); Gupta 
(1950); Hald and Sinbaek (1950); Haldane and Smith (1948); Harris (1952); 
Hartley (1950a, 1950b); Hartley and Pearson (1951); Hastings, et al. (1947); 
Hemelrijk, et al. (1951); Hoeffding (1948b); Hojo (1933); Huntington (1937); 
A. E. Jones (1946); H. L. Jones (1948, 1953); Jurgensen (1947); Kaarsemaker 
and Wijngaarden (1952); Kaplansky (1945a); Katz (1952); Keeping (1952); 







Kendall (1938, 1948a); Kendall, Kendall and Smith (1938); Kendall and Smith 
(1939a, 1939c, 1939d); Kolmogorov (1941); Krishna Iyer (1950a); Kruskal and 
Wallis (1952); Kunisawa, Makabe and Morimura (1951); C. E. V. Lesser (1942); 
Levene (1952); Lord (1947); Lyerly (1952); Mahalanobis, et al. (1933); Mainland
(1948, 1952); Mainland and Murray (1952); Mann (1950); Mann and
Whitney (1947); Massey (1950b, 1951a, 1951b, 1952b); Mathisen (1943);
Mathematical Centre (1952); May (1952); McCarthy (1947); McIntyre (1952); 
McKay (1935); G. H. Moore and Wallis (1943); G. H. Moran (1948b, 1950b, 
1951b); Moriguti (1951, 1952a, 1953); Moshman (1952); Moses (1952); Mostel- 
ler (1941, 1946, 1948); Mosteller and Tukey (1950); Murphy (1948); K. R. Nair 
(1940b, 1948a, 1948b, 1952); Neyman and Pearson (1930); Norton (1946); 
Ogawa (1951); Olds (1935, 1938a, 1949); Olmstead (1946); Olmstead and Tukey 
(1947); Patnaik (1948, 1949, 1950); E. S. Pearson (1932, 1947); E. S. Pearson 
and Hartley (1943); E. S. Pearson and Merrington (1948, 1951); K. Pearson 
(1900, 1919, 1920, 1931, 1933b, 1934); Pillai (1950, 1951, 1952); Pitman (1937c); 
Plackett (1947); Rider (1929, 1951b); Rijkoort (1952); Schmidt (1934); Schiit- 
zenberger (1948c); Shanawany (1936); Sillitto (1947, 1949, 1951); Silberstone 
(1950); Smirnov (1947, 1948); Stevens (1937); Stewart (1941); Stuart (1952); 
B. V. Sukhatme (1951); P. V. Sukhatme (1938); Swaroop (1938); Swed and 
Eisenhart (1943); Swineford (1946, 1948); Terry (1952); H. A. Thomas (1948); 
M. Thomas (1951); Thornton (1943); Tippett (1927); Tukey (1948c, 1949b, 
1949c); van der Vaart (1950b); van der Waerden (1952); Wald (1943); Wallis 
and Moore (1941a, 1941b); Walsh (1946a, 1946d, 1949a, 1949b, 1949c, 1950b,
1950c, 1951b); Welch (1937, 1938); Westenberg (1948, 1950a); White (1952); 
Whitfield (1949); Whitney (1951); van Wijngarden (1950); Wilcoxon (1945, 
1946, 1947, 1949); Wilks (1940, 1942); Williams (1950); Wilson (1952); Winsten 
(1946); Woodbury (1940); Young (1941); Zubin (1939). 


X. Miscellaneous (28) 


Bejar (1952); Birnbaum (1948); Birnbaum and Zuckerman (1944); Chakra- 
barti (1946, 1947); Dyson (1943); Fréchet (1947); Gordon, et al. (1952); Gutt- 
man (1948a); Herdan (1953); Hornich (1941); Kaplansky (1945b); Kerawala 
(1948); Masuyama (1951); von Mises (1939a); Moriguti (1952a, 1953); Mostel- 
ler and Tukey (1949); E. S. Pearson (1950a); Picard (1951); Plackett (1947); 
Reiersol (1944); Shohat (1929); Smirnov (1947); Tukey (1946); Wallis (1942); 
Wilkins (1944); Winsten (1946). 





ERRATA 


Readers and authors are invited to submit corrections to papers pub- 
lished in any previous issue. These will be published each year, in the 
December issue. 


Hyrenius, Hannes, ON THE USE OF RANGES, CROSS-RANGES AND EXTREMES
IN COMPARING SMALL SAMPLES, Vol. 48, No. 263 (September 1953), 534-45.

On page 536, in equation (9b), a factor $T^{-2}$ is missing.

Furthermore, the sum in the equation can be evaluated, simplifying
the formula to

$$f(T) = \frac{(N_1 - 1)(N_1 - 1)!\,N_2!}{T^{2}(N_1 + N_2 - 1)!} \tag{9b}$$


Kruskal, William H., and Wallis, W. Allen, USE OF RANKS IN ONE-CRITERION
VARIANCE ANALYSIS, Vol. 47, No. 260 (December 1952), 583-621.

1. In Section 5.3 of [a] we should have mentioned, had we known of
it, a 1952 article by van der Reyden [b]. Van der Reyden develops
Wilcoxon’s two-sample test independently, and tabulates critical values
of R at two-tail significance levels of 5, 2, and 1 per cent for all sample
sizes such that 10 ≤ N ≤ 30 and 2 or 3 ≤ n ≤ 12, the lower limit for n
being 2 at the 5 per cent level and 3 at the other levels.

Since van der Reyden’s tables for the 5 and 1 per cent levels cover
much the same ground as White’s [a, Sec. 5.3.5], we have compared the
two tables wherever possible and have corresponded with van der Reyden,
who in turn has corresponded with White, about discrepancies between
them. The upshot of this correspondence is: (i) There are numerous
and fairly sizeable errors in the columns for n = 11 and 12 of
the three van der Reyden tables; van der Reyden has very kindly
sent us the corrected values, but these have not yet been published.²
(ii) In addition there are twelve scattered discrepancies, each of one
unit in R, between the van der Reyden and the White tables at the 5
and 1 per cent levels; in all of these van der Reyden appears to be
correct, White’s values leading to probabilities of Type I error slightly
greater than intended.

1 For comments embodying or leading to these corrections and additions we are indebted to K. A.
Brownlee (University of Chicago), P. J. Rijkoort (Royal Netherlands Meteorological Institute),
L. J. Savage (University of Chicago), T. J. Terpstra (Mathematical Center, Amsterdam), D. van der
Reyden (Tobacco Research Board of Southern Rhodesia), and C. White (University of Birmingham).

2 White, in a letter to us, points out that approximately correct values for columns 11 and 12 of
van der Reyden’s tables may be obtained as follows: move the entries in the L columns down one row
to find the approximately correct L values; then to obtain the corresponding U values use the
relationship U = n(m+1) − L (van der Reyden’s notation). This applies to all three levels of significance.

2. Reference [44] should have been listed as shown at the end of this 
note. Had this been available to us in time we should have included the 
following description in Section 5: 

Rijkoort’s C-Sample Test. Rijkoort [44] has proposed the C-sample
test which rejects when

$$S = \sum_{i=1}^{C} n_i^2 \left[\bar{R}_i - \tfrac{1}{2}(N + 1)\right]^2$$

is large. The use of S is not equivalent to the use of H unless all $n_i$’s are
equal; in that case the relationship is $S = N^2(N+1)H/(12C)$. In Rijkoort’s
paper k is used for our C, and when all the $n_i$’s are equal m is
used to denote their common value.
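The relationship between S and H for equal sample sizes can be checked numerically. The sketch below is only an illustration: the data are made up, ties are assumed absent, and H is taken in its usual form, H = 12/[N(N+1)] Σ n_i [R̄_i − ½(N+1)]², which is not restated in this note.

```python
def s_and_h(samples):
    """Return (S, H) for a list of samples; assumes all observations are distinct."""
    pooled = sorted(x for sample in samples for x in sample)
    rank = {x: i + 1 for i, x in enumerate(pooled)}  # rank 1 = smallest
    N = len(pooled)
    S = 0.0
    weighted = 0.0
    for sample in samples:
        n = len(sample)
        mean_rank = sum(rank[x] for x in sample) / n
        dev2 = (mean_rank - (N + 1) / 2) ** 2
        S += n ** 2 * dev2        # Rijkoort's weighting, n_i squared
        weighted += n * dev2      # weighting used in H, n_i
    H = 12.0 / (N * (N + 1)) * weighted
    return S, H

# Hypothetical data: C = 3 samples of equal size 3, so N = 9.
samples = [[1.2, 3.4, 0.7], [2.2, 5.1, 4.0], [6.3, 2.9, 7.5]]
C, N = len(samples), 9
S, H = s_and_h(samples)
print(S, N ** 2 * (N + 1) * H / (12 * C))  # the two numbers agree
```

With unequal sample sizes the two weightings differ from sample to sample, which is why S and H then cease to be functions of one another.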

Rijkoort tabulates the distribution of S for the following cases with
all $n_i$’s equal: C=3, N=6, 9, and 12; C=4, N=8; C=5, N=10. He
also gives the upper tails of the distributions of S (down at least to the
upper 5 per cent points) for C=3, N=15 and for C=4, N=12. In addition,
he gives approximate upper 5 per cent critical values of S for C
from 3 through 10, and for (equal) $n_i$’s from 2 through 10. True critical
values are given in some cases.

We have compared Rijkoort’s distributions with ours when C=3 
and have found a few discrepancies. Correspondence with Rijkoort 
about these leads to the following corrections to the cumulative prob- 
abilities in Rijkoort’s tables. (We omit corrections of only a single unit 
in the last decimal place.) 


k = 3, m = 2: S = 18, P = 0.467. P should be 0.400.
k = 3, m = 2: S = 14, P = 0.600. P should be 0.533.
k = 3, m = 5: 554 ≤ S ≤ 654. The tabulated P’s are all about 0.002
too low.

In addition, Rijkoort has kindly sent us the following corrections to his
table:

k = 4, m = 2: S = 74, P = 0.040. P should be 0.038.
k = 5, m = 2: S = 128, P = 0.0847. P should be 0.0910.
k = 5, m = 2: S = 122, P = 0.1208. P should be 0.1280.

Finally, in Rijkoort’s table of 5 per cent critical points the number
pair 558-566 in column 3 and row 5 III should be 566-578.
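The two corrections for k = 3, m = 2 are small enough to re-derive by complete enumeration. The following sketch assumes that the tabulated P is the upper-tail probability P(S ≥ s) under the null hypothesis that every assignment of the ranks 1, ..., N to the samples is equally likely; the function names are illustrative only.

```python
from fractions import Fraction
from itertools import permutations

def s_statistic(groups, N):
    """S = sum of n_i^2 * (mean rank of sample i - (N + 1)/2)^2, in exact arithmetic."""
    centre = Fraction(N + 1, 2)
    return sum(len(g) ** 2 * (Fraction(sum(g), len(g)) - centre) ** 2 for g in groups)

def upper_tail(k, m, s0):
    """P(S >= s0) for k samples of size m, enumerating all N! rank assignments."""
    N = k * m
    hits = total = 0
    for perm in permutations(range(1, N + 1)):
        groups = [perm[i * m:(i + 1) * m] for i in range(k)]
        total += 1
        if s_statistic(groups, N) >= s0:
            hits += 1
    return Fraction(hits, total)

p18 = upper_tail(3, 2, 18)
p14 = upper_tail(3, 2, 14)
print(p18, round(float(p18), 3))
print(p14, round(float(p14), 3))
```

Under that reading of P the enumeration gives 2/5 and 8/15, agreeing with the corrected values 0.400 and 0.533. Brute-force enumeration of all N! orderings is practical only for very small N; the larger cases in the list would need a less naive count.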
3. On p. 587 of [a], in the fourth line after formula (1.5), the word
“essentially” may be misleading. What we meant to indicate by this
heuristic statement is that without the factor (N−1)/N, H would be
like a sum of squared standardized deviations in which the finite-population
corrections to the variances and the correlation between means
had been disregarded. The factor (N−1)/N is the net result of giving
due regard to these two points.

4. In footnote 6, p. 591, it should have been stated that the compari- 
son between use and nonuse of the continuity correction in the two- 
sample test pertains to the one-tail version. For the two-tail version, 
use of the continuity correction is advantageous only when the proba- 
bility is 0.04 or more. 

5. To avoid an ellipsis, the following phrase should be added on 
p. 593, in line 14, just before the semicolon: “thereby altering the value 
of H.” 

6. The errors listed on the next page have been found in Table 6.1,
most of them as a result of correspondence with T. J. Terpstra.³ These
corrections affect Figure 6.3 of [a] at a few points, but do not change
the general patterns of deviations between true and approximate
probabilities shown by Figure 6.3.

7. We take this opportunity to call attention to a paper by Rijkoort 
and Wise [c] which has appeared since [a]. This presents new approxi- 
mations to the sampling distributions for Friedman’s test [a, Sec. 5.2] 
and for the H test [a] if all samples are of the same size (in which case 
the H test and Rijkoort’s test [44] are equivalent). The approximations 
are based upon a series expansion for the inverse of the incomplete 
Beta integral. Nomograms facilitating use of the new approximations 
are given (in both cases) for significance levels from 1 to 10 per cent, for 
3 to 20 samples, for sample sizes of 1 to 30. 

8. We should also like to call attention to a recent paper by van der 
Waerden [d]. In this paper the power of the Wilcoxon test in the normal 
case is discussed, and an alternative nonparametric test is proposed 
that is more powerful than the Wilcoxon test in the normal case. 


REFERENCES 


[a] Kruskal, William H., and Wallis, W. Allen, “Use of ranks in one-criterion
variance analysis,” Journal of the American Statistical Association, 47 (1952),
583-621.

[b] van der Reyden, D., “A simple statistical test,” Rhodesia Agricultural
Journal, 49 (1952), 96-104.





3 We are indebted to Jack Nadler for making the computations involved in rechecking Table 6.1.
Almost all of the errors had occurred at one stage of the computations, and Mr. Nadler recomputed 
this stage completely. 





CORRECTIONS TO TABLE 6.1 


In each pair of lines, the first repeats the line from Table 6.1 with the 
erroneous entries italicized, and the second gives the corrections. 








Sample Sizes          Approximate minus true probability
n1  n2  n3            Γ (Linear Interp.)    B (Normal Interp.)













> 22. 22 2S 2s 


- 22 3? 





t+ 44 +4 


. 167 
- 100 


.020 
.024 


.012 
.013 


.O11 
.012 


.005 
-004 


.008 





267 
. 209 


.002 
-010 


002 
-004 
-002 


.010 
-009 


.013 
.012 


.003 
-002 


.006 
.004 


.002 










[44] Rijkoort, P. J., “A generalization of Wilcoxon’s test,” Proceedings,
Koninklijke Nederlandse Akademie van Wetenschappen, Series A, 55 (1952),
394-404.

[c] Rijkoort, P. J., and Wise, M. E., “Simple approximations and nomograms
for two ranking tests,” Proceedings, Koninklijke Nederlandse Akademie van
Wetenschappen, Series A, 56 (1953), 294-302.

[d] van der Waerden, B. L., “Order tests for the two-sample problem and their
power,” Indagationes Mathematicae, 14 (1952), 453-58.


Rider, Paul R., THE DISTRIBUTION OF THE PRODUCT OF RANGES IN
SAMPLES FROM A RECTANGULAR POPULATION, Vol. 48, No. 263 (September
1953), 546-9.

On page 549, in formula (13) the factor 2 should be removed from the 
denominator. 


Robson, D. S., and King, A. J., MULTIPLE SAMPLING OF ATTRIBUTES,
Vol. 47, No. 258 (June 1952), 203-15.

The estimate of variance, equation (6), pages 205 and 215, should 
read 


MN m—1 Mn u m —1 
n—m me 


nm ig Miz — 1 





~ 1rM—wN ..om.. N- {-0M,. 
re = —| ste a OD 3 





A proof of the unbiasedness of this estimator may be deduced from the 
appendix by noting, in addition, that 


Dwyer, Paul S., and Waugh, Frederick V., ON ERRORS IN MATRIX
INVERSION, Vol. 48, No. 262 (June 1953), 289-319.

Dr. W. Duane Evans and Mr. John C. H. Fei have called our attention
to the need for modifying Section VII of our paper. In that Section
of the paper we considered the inversion of a given Leontief matrix,
L = I − A, where each element of A is non-negative. We assumed
that any element of L might be in error by 100k per cent, and we
proposed a very simple upper bound to the discrepancies between elements
of the given matrix and the true matrix. Unfortunately equation (7.5)
does not provide an upper bound to such discrepancies.













As we stated in Section VI, maximum discrepancies in all elements
of the inverse of the Leontief matrix would occur if each error in the
given matrix were negative with the absolute value of its bound. Maximum
discrepancies do not occur with (1−k)(I−A), since while the
diagonal errors, −kI, would be negative, yet the non-diagonal errors,
kA, would be positive; but with

$$(1 - k)I - (1 + k)A = L - k(I + A) = L - kH.$$

The extreme inverse matrix may be obtained by calculating $(L - kH)^{-1}$.
Alternatively the extreme error of $L^{-1}$ can be computed from equation
(3.4) of our paper. This becomes

$$D = k(L^{-1}HL^{-1})\left[I + (kHL^{-1}) + (kHL^{-1})^2 + \cdots\right]$$

with $HL^{-1} = 2L^{-1} - I$.

If the diagonal terms are not subject to error the extreme inverse is
obtained from $I - (1 + k)A$. The discrepancy formula above holds with
H replaced by A and $AL^{-1} = L^{-1} - I$.
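As a numerical check on the formulas above, the following sketch compares the extreme error computed directly as (L − kH)⁻¹ − L⁻¹ with the series expression for D. The 2×2 matrix A and the value of k are hypothetical, chosen only so that the series converges quickly; the helper functions are written out by hand for the 2×2 case and are not taken from the paper.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)] for i in range(n)]

def combine(X, Y, a=1.0, b=1.0):
    """Elementwise a*X + b*Y."""
    return [[a * x + b * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def inv2(M):
    """Inverse of a 2x2 matrix."""
    (p, q), (r, s) = M
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

I = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.2, 0.3], [0.4, 0.1]]   # hypothetical non-negative coefficients
k = 0.05                       # hypothetical: each element of L may be off by 5 per cent

L = combine(I, A, 1.0, -1.0)   # L = I - A
H = combine(I, A)              # H = I + A
L_inv = inv2(L)

# Direct route: invert the extreme matrix L - kH and subtract L^{-1}.
D_direct = combine(inv2(combine(L, H, 1.0, -k)), L_inv, 1.0, -1.0)

# Series route: D = k (L^{-1} H L^{-1}) [I + kHL^{-1} + (kHL^{-1})^2 + ...].
kHLinv = combine(matmul(H, L_inv), I, k, 0.0)
series, term = I, I
for _ in range(60):            # 60 terms is far more than needed here
    term = matmul(term, kHLinv)
    series = combine(series, term)
D_series = matmul(combine(matmul(matmul(L_inv, H), L_inv), I, k, 0.0), series)

max_gap = max(abs(D_direct[i][j] - D_series[i][j]) for i in range(2) for j in range(2))
HLinv = matmul(H, L_inv)
identity_gap = max(abs(HLinv[i][j] - (2 * L_inv[i][j] - I[i][j]))
                   for i in range(2) for j in range(2))
print(D_direct, max_gap, identity_gap)
```

For this example every element of D is positive, as expected when all errors in the given matrix are at the negative end of their bounds. The series route is useful mainly when L⁻¹ is already at hand and k is small; otherwise the direct inversion is simpler.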


Brown, J. A. C., Houthakker, H. S., and Prais, S. J., ELECTRONIC
COMPUTATION IN ECONOMIC STATISTICS, Vol. 48, No. 263 (September 1953),
412-28.

We are indebted to George W. Thomson of the Ethyl Corporation, 
Detroit, for drawing our attention to some errors in the illustrative ex- 
ample quoted in our article. The errors are associated with the con- 
vergence of the iterative process given on p. 416. The value of 3(3 — 5) 
there given is the value for which convergence is effected after two 
iterations, but this does not mark the boundary for one-sided converg- 
ence. The latter value is }. A similar correction should be made on p. 
422. 

There is a further mistake on p. 416 at step (4) where the words “pos- 
itive” and “negative” have to be interchanged; and a corresponding 
change in the interpretation of the function letter En. 

We apologize to readers who may have been confused by these er- 
rors. 







BOOK REVIEWS 


Statistics in Psychology and Education. Fourth Edition. H. E. Garrett. New 
York: Longmans, Green and Company, 1953. Pp. xii, 460, $5.00. 


FREDERIC M. LORD, Educational Testing Service


THE fourth edition of this widely used text represents a considerable
rewriting and revision of the previous one. Several recent references are
listed in footnotes to the text. Material on analysis of variance has been ex- 
panded into a separate chapter, which includes an illustrative example of 
analysis of covariance. Other new materials include a fuller treatment of 
Fisher’s z; methods of drawing a random sample; stratified sampling; the 
fourfold point correlation; factors determining selection of tests in a battery; 
and one-tailed tests of significance. 

The first six chapters will serve as a good text for students with minimal 
mathematical aptitude who are to learn to compute such statistics as means, 
standard deviations, percentiles, normal curve areas, and Pearson correla- 
tion coefficients. For students who wish to go beyond this, a text that is 
more nearly correct and complete in its statements of logical and statistical 
inferences would be preferable, providing it is not beyond the student’s 
intellectual grasp. 

There is little occasion to take serious exception to the material in the 
first six or seven chapters. In the important section on Standards of Accuracy 
in Computation, however, the reader may reach the erroneous conclusion 
(p. 23) that a square root usually has less significant figures than (often one- 
half) the number of significant figures in the number whose square root is 
extracted. (The illustration in the text should be corrected to show that
√159.5600 = 12.631706 (sic) with an error of no more than .0000022.)
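The reviewer's bound can be reproduced with a short calculation. The sketch below assumes that 159.5600 is correct to within half a unit in its last decimal place; that reading is an assumption made here for illustration, not something stated in the review.

```python
from math import sqrt

x = 159.5600          # number whose square root is extracted
half_unit = 0.00005   # assumed uncertainty: half a unit in the fourth decimal
reported = 12.631706  # root as given in the review

lowest = sqrt(x - half_unit)    # smallest root consistent with the data
highest = sqrt(x + half_unit)   # largest root consistent with the data
max_error = max(reported - lowest, highest - reported)

print(f"root lies between {lowest:.8f} and {highest:.8f}")
print(f"largest possible error of the reported root: {max_error:.7f}")
```

Because the relative error of a square root is about half the relative error of the number itself, a seven-figure number yields a root good to seven or eight figures rather than to half as many, and the computed maximum error falls within the stated .0000022.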

A very worthwhile achievement of the fourth revision is the removal 
(primarily from chapters 8-10, dealing with sampling, standard errors, and 
testing experimental hypotheses) of the serious confusion, pervading the 
third edition, between a priori and fiducial probability. Only the last sentence 
of the final chapter escaped revision: “This correction gives the value which 
R would most probably take in the population from which our sample was 
drawn.” 

Some of the more serious of the remaining errors and misstatements, 
mostly relating to the logic of statistical inference, are listed below: As a 
criterion for general use in judging randomness of sampling it is suggested 
that “If samples are fairly consistent, therefore, they are presumably random 
unless subsequent examination reveals a common bias.” (p. 205). Also, if
we can assume the trait to be normally distributed, then “symmetry of dis- 
tribution becomes an excellent criterion of sample adequacy.” (p. 204). 
After making a certain test of the significance of the difference between 
means, “... we retain the null hypothesis and conclude with confidence
that, on the evidence, there is no real difference between Norwegians and 




Belgians on the ‘combined scale’... ” (p. 216). A two-tailed test “should 
always be used when, in accordance with the null hypothesis, our two groups 
have conceivably been drawn from the same population...” (p. 217).
“Forty-two salesmen have been classified into three groups—very good, 
satisfactory, and poor—by a concensus of sales managers... how many 
of the 42 salesmen may be expected to fall in each category on the hypothesis 
of a normal distribution [may be determined from a table of normal curve 
areas] by dividing the baseline of a normal curve (taken to extend over 6σ)
into 3 equal segments of 2σ each.” (p. 257).

A misstatement about statistical technique is that “χ² is not stable when
computed from a table in which any experimental frequency is less than 
5” (p. 258). (It is the theoretical frequencies that are pertinent.) 

The chapter on The Reliability and Validity of Test Scores does not pro- 
vide as clear a discussion of the different kinds of reliability as could be 
desired. Exactly two pages in this chapter, incidentally, are devoted to a 
consideration of item analysis. 

Special favorable mention should be made of the material on Type I and 
Type II errors, and of much of the material on the phi coefficient, on one-
tailed significance tests, on scaling, and on the multiple correlation
coefficient.

In the introduction to the first edition in 1926, Woodworth indicates that 
the statistician for whom the book is intended is “he who has selected the
scientific or practical problem. ... He selects the statistical tools to be
employed . . . [he] must have a discriminating knowledge of the kit of tools 
which the mathematician has handed him.” In the reviewer’s opinion, what- 
ever may have been the case at the time the foregoing was written, the book 
is not appropriate for today’s statistician who answers to, or today’s student 
who is to be trained to answer to, the foregoing description. The book makes 
no serious effort to specify the assumptions underlying many of the state- 
ments made. In the discussion of regression and prediction, for example, it is 
frequently asserted that the predicted value is the “most likely value” with- 
out any suggestion that normality or some other special property of the 
bivariate distribution is being assumed. Standard error formulas are given 
and their use illustrated often without indicating to the reader that 1) 
normality has been assumed, and 2) the formulas can only be safely used 
with large samples. 

The chapter on Further Methods of Correlation particularly shows the 
tendency to provide a ready recipe for the calculation of any desired statistic, 
without adequately explaining the meaning of the statistic in question. For 
example, it would be useful to point out that the point biserial correlation is 
simply the Pearson product-moment correlation that would be found if 
any two arbitrary numerical values (e.g., 0 and 1, or 7 and 19) were assigned 
to represent the dichotomous variable, and the usual formula for the product- 
moment correlation were then applied. The reader who wishes to apply in 
actual work the techniques of analysis of variance and covariance, the paired 







comparison scaling technique, or the biserial, tetrachoric, or contingency 
correlation methods should refer to some book giving a more thorough treat- 
ment than is possible in a text designed primarily for other purposes. 
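The reviewer's remark about the point biserial correlation can be illustrated directly. In the sketch below the scores and group labels are invented; the codes 0/1 and 7/19 are the ones the reviewer mentions; and the closing formula, (M1 − M0)·√(pq)/s with s the standard deviation of all scores computed with divisor N, is the usual textbook form, quoted here as an assumption about what the text under review presents.

```python
from math import sqrt

def pearson_r(x, y):
    """Ordinary product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: a dichotomy (two groups) and a test score.
group = [0, 0, 0, 0, 1, 1, 1, 1]
score = [12, 15, 11, 14, 18, 16, 20, 17]

r_01 = pearson_r(group, score)                               # codes 0 and 1
r_7_19 = pearson_r([19 if g else 7 for g in group], score)   # codes 7 and 19

# Usual point biserial formula, for comparison.
n = len(score)
ones = [s for g, s in zip(group, score) if g == 1]
zeros = [s for g, s in zip(group, score) if g == 0]
p = len(ones) / n
q = 1 - p
mean_all = sum(score) / n
sd_all = sqrt(sum((s - mean_all) ** 2 for s in score) / n)
r_pb = (sum(ones) / len(ones) - sum(zeros) / len(zeros)) * sqrt(p * q) / sd_all

print(r_01, r_7_19, r_pb)   # all three agree
```

The agreement holds for any two distinct codes as long as the larger code is given to the same category; reversing the codes changes only the sign.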


Sources of Wage Information: Employer Associations. N. Arnold Tolles and 
Robert L. Raimon (New York State School of Industrial Relations at Cornell 
University, Ithaca, New York). “Cornell Studies in Industrial and Labor 
Relations,” Volume III, Spring 1952, pp. xvii, 351. Paper. $3.00. 


M. I. GERSHENSON, California Department of Industrial Relations


PART I of this monograph presents individual digests of most of the wage
surveys conducted by employer associations in the United States. The
summary descriptions of each of the 220 wage surveys conducted by 120 
employer associations include such information as starting date of the 
survey, how frequently the survey is made, what industries and areas are 
surveyed, the number of plants or companies participating, the sample 
coverage, types of data collected, and types of statistical measures used to 
summarize the information; also, types of “fringe” items, method used to 
collect the original information, and form of publication or distribution of 
the data. 

Unfortunately any ambitious listing such as this, which by the nature of 
the project requires a great deal of time, becomes out of date even before 
it goes to press. No wage figures are given and the authors stress the fact 
that the listing of an association does not necessarily imply that the reader 
may obtain any wage figures from that association. As a reference of avail- 
able source data on wages, the listings are certainly valuable but the mono- 
graph may be frustrating to those seeking specific wage rate information 
since a large number of the entries indicate that the survey results are avail- 
able only to members of the association. 

An alphabetical list of employer associations, a regional finding list, an 
industrial finding list, and a finding list of area-oriented surveys are included 
in Part I together with a technical note on definitions, procedures followed, 
and problems encountered. 

The most valuable contribution to the field is contained in Part II which 
presents a detailed analysis and appraisal of wage surveys conducted by 
employer associations. The authors assess very frankly the elements of 
strengths and weaknesses of existing survey methods and state that this may 
be helpful to employer associations contemplating the conduct of wage 
surveys or seeking to improve their present methods and also to employers 
who seek to interpret and evaluate wage information they receive, to labor 
unions seeking to appraise the validity of the wage information obtained 
through employer association surveys, and to government analysts who may 
need standards for assessing wage information presented to them. Both 
producers and users of wage surveys will find a careful reading of Part II 













of this monograph very worth while. It contains by far the best practical
discussion to date on the methodology of wage surveys. 

Among a number of weaknesses discussed are those of sampling and of 
methods of collection. It is the authors’ conclusion that the employer associa- 
tions appear to give very little attention to the selection of their wage survey 
samples. One may agree that “the objective should be that of securing a
representative and balanced sample”, but it should be pointed out that many 
associations have no way of obtaining adequate universe data in terms of 
individual establishments for a given area or industry. 

The authors believe the accuracy of many of the surveys and the uniformity
of their occupational classification are open to question. They point out
that nine-tenths of the wage surveys of employer associations are based on 
mail questionnaires and that more than half of those which seek occupational 
data solicit the information in terms of mere job titles without any standard- 
ized job descriptions. Attention is directed to the very wide range in wage 
rates for individual occupations which results from such procedures. 

It is implied that in many cases greater accuracy and narrower ranges are 
obtained in surveys where the data are collected directly by field visits using 
carefully prepared occupational descriptions, the procedure used by the U. S.
Bureau of Labor Statistics. Unfortunately this in itself does not insure nar- 
row ranges, as is demonstrated by the results of the BLS Occupational Wage 
Surveys. 

Wage surveys can certainly be improved by better methods of sampling 
and collection, but it is this reviewer’s opinion that a great deal remains to 
be done in developing means of eliciting more accurate replies from re- 
spondents. Highly developed job descriptions in the hands of trained field 
agents help, but evidence indicates we must still strive to devise more effec- 
tive means of reducing response error even where we have field collection 
and job descriptions. 

The authors touch on this problem and discuss some possible solutions, 
but there is need for much additional work in this aspect of wage surveys. 


Revue de Statistique Appliquée. Volume 1, No. 1, 1953. Paris: Centre de For- 
mation des Ingénieurs et Cadres aux Applications Industrielles de la Statistique, 
Institut de Statistique de l’Université de Paris. Pp. 103. Paper. 


This new journal is the organ of the newly formed (1952) statistical center
whose full title is given above. The objectives of the Centre are stated as 
follows by its general director, M. Georges Darmois: 


We want to make it possible for the leaders of French industry to train
their personnel in the effective use of the statistical techniques which have 
so completely proved themselves in other countries. 

It is necessary on the other hand to pursue research so that new problems 
may be studied and so that a continuing relationship with the users of 
statistics may be maintained. (Page 8.) 












BOOK REVIEWS 917 


The Centre offers two types of short courses in statistics for industrial 
personnel. The first, lasting from 10 to 15 days, requires no special mathe- 
matical training. The second lasts for three weeks and is designed for engi- 
neers. The first course emphasizes the methods of statistical quality control 
while the second provides a wider coverage of statistical topics. Both are 
oriented toward statistical inference. A detailed outline of contents is given 
in this first issue of the Revue, pp. 16-24. 

The Centre will also maintain a consulting service (Bureau d’Etudes du 
Centre) which will serve the firms and individual engineers who belong to 
the Centre and contribute to its financial support. The Revue, under the 
direction of M. E. Morice, has been established primarily as a liaison with 
the membership. It is to carry examples of statistical applications as well as 
news of statistical meetings and the like. 

For the first few years at least, the Revue will concentrate on discussions 
of the usefulness of statistics in different areas of business, backed up by 
numerous concrete examples. The first issue fits closely in this mold with a 
series of articles describing both general and specific applications in many 
parts of French industry. While there are articles on statistical quality con- 
trol, the applications cover a much wider field of business applications. For 
example there is a discussion of the organization of statistics within the firm,
the application of statistics in the planning of capacity of equipment needed 
in power plants, description of an industrial experiment, and two articles on 
market research, one of which gives some very interesting data on taste- 
testing. There is an annotated bibliography describing four different books 
on statistics, all in French, which might be of interest to members of the 
Centre (pp. 97-99). 

The Revue hopes gradually to shift its emphasis from concrete applications 
to methodological articles which will be aimed at graduates of the short 
courses described above. It hopes also to publish applications of statistics 
by these graduates and by other readers, and makes an interesting appeal 
for submission of expériences malheureuses whenever a lesson can be learned 
therefrom. The Journal’s address is: 

Monsieur le Rédacteur en Chef de la «Revue de Statistique Appliquée» 
11, rue Pierre-Curie, Paris 5 éme, France. H.V.R. 


The Problem of Summation in Economic Science. A Methodological Study 
with Applications to Interest, Money and Cycles. Géran Nyblén. Lund Social 
Science Studies No. 4, Lund, C. W. K. Gleerup, 1951. Pp. xii, 289. 


John S. Chipman, Harvard University


From the title of this book, one might gain the impression that it is a
study on index numbers. It is nothing of the sort. Broadly, it is no less 
than a critique of the foundations of modern economic theory, especially the 













theory of distribution; specifically, it is the only attempt this reviewer has 
seen to confront the theory of games with empirical data. 

Nyblén opens in Chapter I with a discussion of “the fundamental idea that 
economic variables are produced by a mechanism, which can be described 
as a system of simultaneous equations” (pp. 5-6) and quotes as typical of a
predominating contemporary viewpoint Samuelson’s assertion that “any 
sector of economic theory which cannot be cast into the mold of such a sys- 
tem must be regarded with suspicion as suffering from haziness” (p. 6). 

In the rest of the book Nyblén subjects this point of view to forceful 
criticism. He turns in Chapter II to a discussion of specific economic models,
dealing first with the Leontief system and criticizing it for being “very 
‘mechanical’ in the sense that no fundamental economic decisions are explic- 
itly tied to it...” (p. 21). It should properly be regarded, he says, as em- 
bedded in a linear programming system, which allows for choice among 
alternative production processes; but such a system can only be made deter- 
minate by a statement of social objectives (i.e., by maximization of an “ob- 
jective function”) and this implies the control of the economy by a single 
will, and therefore “can comprise no real treatment of a problem of distribu- 
tion” (p. 31). The conclusion is somewhat weakened by the subsequent 
publication of the “substitution theorem” of linear programming, but not 
invalidated.¹

Next, Nyblén discusses the marginal productivity theory of distribution, 
pointing out that marginal productivity does not determine the distributive 
shares, but only determines schedules of demand for factors and supply of 
products; distribution is then determined in Walrasian fashion by the 
interrelation of the demand and supply schedules of firms and households. 
If markets are competitive, the solution is determinate, and “the distribution 
process described is automatic, because no particular agreements of any kind 
between the decision units are needed for it to function, and it is ultra- 
harmonic, because no conflicts of interest are present and no unit has more 
influence on the price-determination than any other” (p. 41). On the other 
hand if monopolistic or imperfect markets are introduced, the system be- 
comes overdetermined; for, as Marschak has pointed out, the addition of 
sloping demand functions adds more equations to the system than unknowns, 
and these functions can therefore not be independent of one another if 
markets are to be cleared. Distribution is then left unexplained. 

From this impasse Nyblén is led to a consideration of the theory of games. 
He notes the distinction made by von Neumann and Morgenstern between 
“inessential games,” in which the payoff that goes to a set of players is al- 
ways equal to the sum of the amounts those players would receive when 
acting independently, and “essential games” in which the amount received 
by a coalition always exceeds the amount that its members could obtain 
independently. This is where “summation” comes in: the proposition that 





1 T. C. Koopmans (ed.), Activity Analysis of Production and Allocation, New York, 1951, chapters
VII-X. 










“in general” a set of players can obtain more by coalescing than by acting 
independently is called the “first summation theorem.” In Nyblén’s words: 
“there is always one part of the national income the distribution of which 
can and must necessarily be settled through agreements between the members 
of society; the distribution of the national income can never be completely 
settled in an automatic and harmonious way.” (p. 77). The “generality” 
here, it should be noted, is purely formal, and Nyblén is to be criticized for 
not making sufficient distinction between the formal and the empirical. The 
fact that firms’ revenue functions are necessarily interdependent was partly 
recognized by Chamberlin, and it is curious that Nyblén includes no discus- 
sion of the former’s solution to the problem in Chapter V of the Theory of 
Monopolistic Competition. However it must be admitted that there is still 
considerable oligopolistic indeterminacy left in the general equilibrium of 
monopolistic competition, so that there is ample justification for Nyblén’s 
view that the system can be neither “automatic” nor “ultra-harmonic”. 
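
The distinction between inessential and essential games can be illustrated with a minimal sketch. It is not taken from Nyblén or from von Neumann and Morgenstern: the player names, the payoff figures, and the helper functions `coalitions`, `is_inessential`, and `coalition_gain` are all hypothetical, chosen only to show what it means for a coalition to receive more than the sum of what its members obtain alone.

```python
from itertools import combinations


def coalitions(players):
    """Yield every non-empty coalition of the players as a frozenset."""
    for size in range(1, len(players) + 1):
        for members in combinations(players, size):
            yield frozenset(members)


def is_inessential(players, v):
    """True if every coalition receives exactly the sum of its members' solo payoffs."""
    return all(
        v[c] == sum(v[frozenset([p])] for p in c)
        for c in coalitions(players)
    )


def coalition_gain(players, v):
    """Amount the coalition of all players earns beyond the sum of solo payoffs."""
    grand = frozenset(players)
    return v[grand] - sum(v[frozenset([p])] for p in players)


# Hypothetical three-group economy; the figures are illustrative only.
players = ["workers", "farmers", "savers"]

# Inessential case: a coalition is worth exactly the sum of its parts.
additive = {c: 10 * len(c) for c in coalitions(players)}

# Essential case: any coalition of two or more earns a bonus of 5.
essential = {c: 10 * len(c) + (5 if len(c) > 1 else 0) for c in coalitions(players)}

print(is_inessential(players, additive), coalition_gain(players, additive))
print(is_inessential(players, essential), coalition_gain(players, essential))
```

In the second game the positive gain is the part of the total that, in Nyblén's terms, must be divided by agreement, since no player's solo payoff accounts for it.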

Next, Nyblén takes issue with the assumption of transferable utility, that 
is, with the postulate of von Neumann and Morgenstern that the utility lost 
by one player or set of players is equal to the utility gained by the remaining 
set. As von Neumann and Morgenstern were forced to admit, this boils down 
to the assumption that payoffs are in monetary terms and that players maxi- 
mize the expected value of monetary returns rather than utility. Nyblén 
takes issue with transferable utility on the basis of Arrow’s proposition that 
“in general” no social welfare function can be constructed from individual 
utility functions, so no common standard of value exists; this is the “second 
summation theorem”. He goes on to state: “If such a common preference 
scale exists there can be essentially no diversity of interests at all, and the 
distribution process can constitute no problem” (p. 95). This statement is 
rather extreme, for even if individual orderings of commodities are identical, 
so that a social welfare function can be established, utility is still not trans- 
ferable, that is, interpersonal comparisons still cannot be made. Furthermore, 
there is still room for struggle over distributive shares. Thus the second sum- 
mation theorem is rather a will-o’-the-wisp. 

Once we settle for monetary payoffs, the transferability assumption still 
leaves a serious problem: the constant-sum character of the game. In order 
to deal with non-zero- (or non-constant-) sum games von Neumann and
Morgenstern introduced, as is well-known, a fictitious n+1-th player who
receives (or pays) the difference; however, regarding as a “patent absurdity”²
the notion that this player can make bribes, they limited the solutions of non- 
zero-sum games to discriminatory ones in which “nature” is allowed by the 
real players to receive only a specified amount—in the extreme case, only 
what it could obtain in isolation. As a result there is little to distinguish 
this from the constant-sum game. This strikes me as a principal weakness 
of the theory, and it is reflected in Nyblén’s treatment (p. 91) in which the 








2 John von Neumann and Oscar Morgenstern, The Theory of Games and Economic Behavior, Prince- 
ton, 1947, p. 513. 










n real players cooperate in order to maximize their payoff from nature 
(this is the “production problem”) and then fight over the spoils (the
“distribution problem”). This interpretation neglects what is surely the
distinctive feature of the economic “game”: the way in which the pie is distributed
affects the size of the pie itself. Nyblén is conscious of the artificial nature 
of this dichotomy between production and distribution, and finally confesses 
that he is “not able to point out a synthesis between the two extremes” 
(p. 128). As a second-best solution, he concludes that the main features 
of the free economy are best analyzed in terms of distribution theory rather 
than by production theory—by the theory of games instead of by models 
which can be expressed in terms of systems of equations. 

As a first step in applying the theory of games, Nyblén tackles the aggrega- 
tion problem (pp. 57-64). Formally, there are great difficulties in game theory 
in aggregating players into indissoluble groups. Some readers may be un- 
satisfied (though intrigued) by his procedure of invoking “incomplete in- 
formation” and “ ‘irrational’ socio-psychological factors” in order to justify 
aggregation of players into groups (forming constant-sum subgames) which 
themselves obey the “rational” dictates of game theory; perhaps it would 
be better to treat these groups as “teams” in Marschak’s sense.⁴

We come then to the empirical part of the book. In Chapter V the author 
divides the economy into four groups: workers, farmers, entrepreneurs, and 
capitalists or rentiers (he gives them the misleading name “savers”) and at- 
tempts to explain the share of the latter in the national income. He sets 
himself the task of explaining the close correlation between the price level 
and the rate of interest before 1932, and the gradual rise in the price level 
relative to the rate of interest after that date. Following a most interesting 
discussion of the Jevons-Wicksell theory of interest and its extension by 
von Neumann and Hawkins, Nyblén discusses the abstinence and Keynesian 
theories, concluding with the view (pp. 150-1) that both are rationalizations 
of economic events, the first prior to 1930 and the second subsequently. 
Taking as a measure of rentiers’ relative income share the ratio of the inter- 
est rate to the general price level (p. 169), Nyblén concludes from the data 
that rentiers’ share in national income was relatively constant before 1930 
and began to decline thereafter. As measures of “the” interest rate he chooses 
railroad bond yields and rediscount rate for the U.S., and yield of consols 
and Bank Rate for Britain. The choices are rather unfortunate owing to the 
fact that the ratio of interest rate to price level is a fairly accurate index of 
relative income of bondholders only in the case of short-term privately- 
held securities. Faced with declining yields on consols, widows and orphans 
can postpone their euthanasia indefinitely by holding on to them, and in the 
case of railway bonds can postpone the day of reckoning for a generation. 





3 The theory of non-constant-sum games is still in the early process of development. Cf. John F. 
Nash, Jr., “The Bargaining Problem,” Econometrica, April 1950, and Howard Raiffa, “Arbitration 
Schemes for Generalized Two-person Games,” Contributions to the Theory of Games, Vol. II, edited by 
H. W. Kuhn and A. W. Tucker, Princeton, 1953, pp. 361-87. 

4 Econometrica, July 1953, pp. 485-6. 







Similarly, a rise in yields is of no temporary use to bondholders during a 
period of inflation (with consols, no use at all) unless their holdings are short- 
term. It is only in the long run that parallel fluctuations of interest rates and 
prices can indicate stability of interest-income. Perceiving however that 
there was, as we may grant, a marked change after 1930, Nyblén comes forth 
with the hypothesis (p. 166) that before 1930 independent central banks 
carried out policies designed to maintain stability in income shares, whereas 
after that date they lost their independence and came under the political 
control of laborers, farmers, and entrepreneurs. In the language of game 
theory, bondholders after 1930 became the “excluded player” in a now
discriminatory four-person game. Thus, concludes Nyblén, “the theory of games
gives a theoretical structure capable of comprising such sharp changes, which 
is remarkably different from the potentialities of traditional economics” 
(p. 165). 

While Nyblén’s emphasis on the political determination of distributive 
shares is interesting, his claims for the theory of games are inadequate. 
There is nothing within the theory of games to explain the change from an 
objective to a discriminatory solution; this follows necessarily from the 
static nature of game theory. The change remains exogenous and unex- 
plained. The theory of games takes the “accepted standard of behavior” 
as given, while it is this that is mostly in need of explanation. Like a ward- 
robe which provides suits for all occasions, the theory of games can no doubt 
provide categories of solutions to fit all the possible facts; but no amount 
of study of that wardrobe will predict or even explain what its user will 
wear tomorrow. And even then the clothes are completely out of character 
with the wearer, and one cannot help feeling that they fit very uncomfortably. 

In Chapter VI Nyblén turns to the problem of international distribution 
of income. In the course of lengthy excursions into the quantity theory of 
money, the Patinkin controversy, and the purchasing power parity theory, 
he makes the following observations: that according to the quantity theory 
inflation leaves relative prices, and consequently the distribution of incomes, 
unchanged (Nyblén fails to stress the fact that this analysis is applicable 
only to a stationary economy) and that, if purchasing power parity is as- 
sumed as well, inflation leaves international distribution of income (measured 
in some currency) unchanged. It is curious that Nyblén does not discuss the 
inadequacy of such a measure of a country’s real income, even if the latter 
can be said to exist. He then asserts that the purchasing power parity and 
quantity theories were valid in some periods but not in others, and seeks a 
“theory of theories”. The latter turns out to be the theory of games with de- 
composable characteristic functions. These are constant-sum games in which 
the players are divided, say, into two sets (the sets will be countries) with the 
property that the amount that any group from one set can obtain together 
with any group of the other set is the same as the amount the two groups can 
obtain in isolation; in other words, there is no advantage to be gained from 
inter-country coalitions. In spite of this property the solutions of this game 













are not necessarily decomposable, that is, it is not true “in general” that the 
sums going to each country are constant. Thus, if the players in one country 
fail to coalesce, they may fail to get their “due”, and a transfer—called an 
“Excess”—will then take place to the other country—a tribute without 
being a bribe. The case in which the tribute is zero is said to be “exceptional” 
(pp. 214-15), but later we are told (p. 220) that its opposite seems
“exceptional”; again the distinction between the formal and the empirical is blurred,
and no attempt is made to give this tribute any interpretive meaning. As a 
result, a factitious hypothesis emerges: that “the observations consistent 
with the purchasing power parity theory imply the prevalence of a zero 
Excess between the countries studied” (p. 223) and “the observations show- 
ing successive changes of the relations between the national incomes com- 
pared imply the presence of a non-zero Excess” (p. 224). If this were all 
there was to the hypothesis, the same objections would hold as were pre- 
viously presented; but in this case there is an additional (but hardly startling) 
observation (pp. 224-7): A non-zero Excess comes about only if some region 
is not sufficiently integrated into a coalition; furthermore, once such an Excess
has developed, one may expect a distribution struggle to ensue, taking the 
form of competitive inflationary movements. Nyblén makes special note of 
(1) the pre-1914 period, in which the purchasing power and quantity theories 
are said to be valid (apparently in the trivial sense that both exchange rates 
and relative price levels were steady), (2) the period of the early twenties 
with violent fluctuations in terms of trade, and (3) the period after the second 
World War and before Korea, characterized by a worsening of Europe’s
terms of trade. The latter events he blames on Europe’s lack of integration, 
and the policy recommendations follow naturally. 

Nyblén finally turns, in Chapter VII, to an analysis of business cycles. 
He criticizes econometric models depicting cycles as oscillations around an 
equilibrium derived from difference or differential equations, and after dis- 
cussing the works of Schumpeter, Wiener, Domar, and Dahmén, comes 
forth with his own novel theory of cycles. To begin with, “economic progress 
and economic crises and depressions are most intimately connected” (p. 
266) for the following reasons: an innovation, taking the form of a specific 
investment, raises capital values and lowers capital values in other spheres 
of the economy, thus changing the “objective possibilities” of the situation 
(specifically, the characteristic function of the game); there ensues “a dis- 
tribution struggle which we believe to be the essence of crises and depres- 
sions” (p. 266) since it results in bankruptcies and a “breakdown of the pric- 
ing system” (p. 262). The depression is intensified by a distribution struggle 
among major social groups (in addition to conflict among businesses) brought 
about by political upheavals (p. 269n) and only brought to an end when the 
distribution struggle has been settled. Then the revival takes place, since the 
previous innovations had opened up profitable opportunities that had been 
neglected during the distribution struggle. 

Nyblén’s business cycle theory impresses me as being the least artificial 
of his hypotheses, and it is noteworthy that it is also the most original and 










least tied down to the specifications of a particular game-theoretical construc- 
tion. 

An interesting question which emerges from this discussion is that of deter- 
minacy. Nyblén points out that the distribution struggle is settled through 
agreements and therefore not automatic and harmonious; but if its outcome 
can be predicted at all, is it not at least automatic? The question is not one 
of prediction with certainty versus prediction with a given probability, for 
Nyblén rejected stochastic systems-of-equations systems along with the
rest (p. 5). A partial exit to this impasse might be found in the dynamic 
character of the model, for even if a distribution struggle is settled, a lot of 
time elapses during which the struggle goes on. He admits that the outcome 
“could never be uniquely predictable” (p. 265), yet appears to be committed 
in principle to a belief in the ultimate predictability (at least in a stochastic 
sense) of socio-economic events. The answer seems to be that a theory is 
non-automatic and non-harmonic only if the phenomena it describes are not 
determinate within the economic system, but only within a wider universe; 
and in this wider universe, it appears that hypotheses cannot be expressed 
in terms of systems of equations, but must find some other, qualitative, 
expression. It has been suggested to me (by D. Ellsberg) that the kind of 
prediction that Nyblén and other game theorists may have in mind consists 
of a narrowing down of the class of possible solutions; thus one might be 
able to predict the range of a variable without any specification of a proba- 
bility distribution over that interval. 

Nyblén has done economists a service by attempting to apply the theory 
of games to the facts, but the results cannot be considered conclusive, as the 
theory is imperfect and the statistical methods are crude. Moreover the 
analysis is frequently marred by misplaced concreteness. More important, 
however, is his insistence on the role of political and social phenomena in the 
explanation of economic events. His study remains an exploration into an 
as yet little-known world. It is to be hoped that his work will stimulate others 
into seeking answers to some of the fundamental questions he raises.


Measurement of Productivity. Organisation of European Economic Cooperation. 
Paris: 1952. (U.S. Distribution Agent: Columbia University Press). Pp. 104. 
$1.25, Paper. 


Peter O. STEINER, University of California (Berkeley) 


Many Americans have had the opportunity of meeting with members of 
one or another of the groups of foreign visitors who have come to the United 
States under the sponsorship of the technical assistance program of Euro- 
pean Cooperation Agency. This thin monograph is a report on what was 
learned about methods of productivity measurement by the members of 
three such groups who visited the United States in 1950 to study the produc- 
tivity division of the Bureau of Labor Statistics. Each mission spent five or 
six weeks in Washington listening to lectures by BLS department heads, and 
four weeks in the field visiting industrial firms, universities, and regional 







offices of BLS. The report seems to be largely a condensed summary of the 
notes they took. 

Americans will be interested in the report chiefly for appraising whether 
missions of this sort are worthwhile methods of communication. I offer no 
opinion on this subject. The groups felt “the visit to the United States 
proved of great value both for the discussions and exchanges of views which 
it made possible, and for the cordial relations established between the rep- 
resentatives of the Member countries,” and urged that further missions be 
organized in the future. 

The material covered includes: history and organization of the BLS; uses 
of productivity measures; problems in defining and measuring input, out- 
put, and productivity; procedures for collection of data by direct inquiry 
and from secondary sources; and appendixes containing sample questionnaires,
lists of data available, and methods for computation of indexes.
Since the text is very short (about 30 pages), it is evident that treatment of 
each topic is brief; actually brevity approaches superficiality on most points. 
From the point of view of content, more systematic and adequate treat- 
ment of the issues is available in many publications; see, for example, the 
International Labor Office, Methods of Labour Productivity Statistics, (Ge- 
neva, 1951), or for a more technical treatment, Irving Siegel, Concepts and 
Measurement of Production and Productivity, (Washington, 1952). 

One apparent purpose of the report is to provide information to the Euro- 
pean countries from which the missions were drawn that might be useful 
to them in establishing, revising, or expanding their programs of productivity 
measurement. On most technical issues, as previously suggested, alternative 
sources will be more helpful. The report does a service, however, by warn- 
ing that BLS methods cannot be transferred without being carefully investi- 
gated and adapted. One of my colleagues tells of the time he requested his 
students in an examination to visualize themselves as top executives and 
indicate how they would solve a particular problem he then set before them. 
One paper was turned in almost immediately. The student had written “I 
would hire you as a consultant.” In much the same way one feels that the 
report implicitly urges any government considering productivity measure- 
ment to hire a BLS statistician as a consultant. This, of course, is sound ad- 
vice. 


Concepts and Measurement of Production and Productivity. Irving H. Siegel. 
Washington: U. S. Bureau of Labor Statistics, 1952. Pp. 108. Paper.


Arthur L. Broida, Board of Governors of the Federal Reserve System


Irving Siegel has been concerned for many years with the statistical meas- 
urement of production and productivity, both as a practitioner at the WPA 
National Research Project and the Bureau of Labor Statistics and as a stu- 
dent. This study is in good part a synthesis of the views originally set forth 







by him in this Journal and elsewhere. It has been reproduced as a working
paper of the National Conference on Productivity, and can be obtained from 
the BLS. 

Between the introduction and summary are four substantive chapters, 
one dealing with concepts and three with technical matters. The most valua- 
ble material is in the technical sections. Among the subjects investigated 
are the relationships between alternative indexes of production (e.g., Paasche 
and Laspeyres) and productivity (e.g., indexes derived by relating employ- 
ment measures to value-weighted and labor requirement-weighted quantity 
indexes); the nature of aggregates; directly calculated indexes and those de- 
rived by deflation; the relationships among indexes of gross output, net out- 
put, and materials consumption; alternative formulations of given indexes; 
coverage adjustments; and the decomposition, or “partitioning,” of changes 
in aggregates into additive contributions of various elements. 
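
For readers who want the contrast between the Laspeyres and Paasche forms made concrete, the following is a minimal sketch of the two quantity indexes in their standard textbook form. It is not drawn from Siegel's monograph; the two commodities, their prices and quantities, and the function names `laspeyres_quantity` and `paasche_quantity` are hypothetical and serve only as illustration.

```python
from fractions import Fraction


def laspeyres_quantity(p0, q0, q1):
    """Quantity index weighted by base-period prices: sum(p0*q1) / sum(p0*q0)."""
    numerator = sum(p * q for p, q in zip(p0, q1))
    denominator = sum(p * q for p, q in zip(p0, q0))
    return Fraction(numerator, denominator)


def paasche_quantity(p1, q0, q1):
    """Quantity index weighted by given-period prices: sum(p1*q1) / sum(p1*q0)."""
    numerator = sum(p * q for p, q in zip(p1, q1))
    denominator = sum(p * q for p, q in zip(p1, q0))
    return Fraction(numerator, denominator)


# Hypothetical data for two commodities (say, pig iron and steel).
p0 = [2, 5]    # base-period prices
q0 = [10, 4]   # base-period quantities
p1 = [3, 4]    # given-period prices
q1 = [12, 5]   # given-period quantities

laspeyres = laspeyres_quantity(p0, q0, q1)  # 49/40
paasche = paasche_quantity(p1, q0, q1)      # 28/23

print(float(laspeyres), float(paasche))
```

The two indexes differ only in the price weights applied to the same quantity change, which is one simple instance of the multiplicity of legitimate measures that the monograph stresses.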

The notion of the multiplicity of legitimate measures of “production” and 
“productivity,” presented in the introduction, is properly given strong em- 
phasis. Also useful is the review of the meanings given these terms in the 
literature of economic theory, national income, and index numbers, which 
is found in the chapter on concepts. The summary chapter contains some 
interesting proposals for research. 

Questions may be raised concerning certain of the main ideas presented. 
A great deal of space is given to the “multiperiod macrotype,” a notion in- 
tended to rationalize the numerical comparisons given by indexes. The 
author observes that value theory permits only ordinal comparisons, and 
these only under highly restrictive assumptions of constancy in tastes, tech- 
nology, etc., so that the usual production and productivity indexes “do not 
have any ‘economic’ import.” The solution is to imagine something called 
the macrotype (also referred to as a “fictional creature,” a “decision maker,” 
a “mythical appraiser,” a “generalized consumer equally at home in all pe- 
riods,” and a “personification of a formula”) whose “relevant behavior is 
not ‘economic’ in the ordinary sense but is described by the specific content 
and structure of the index”—i.e., in whose eyes the index is numerically sig- 
nificant—and then “to judge its plausibility.” 

Apparently there are separate macrotypes for indexes of every possible 
content and structure. The author does not discuss the bases on which their 
relative “plausibilities” are to be determined, but presumably they would 
be the same as are relevant in evaluating the indexes directly. It is not clear, 
therefore, that the interjection of the macrotype greatly facilitates matters. 
The author believes that “The notion of the ‘macrotype’... dramatizes 
the value judgments that underlie numerical comparisons” (page 10) and 
“Without some such conception we should probably have to abandon at- 
tempts to measure changes in the ‘physical volume’ of the physically chang- 
ing goods of an advanced industrial society” (page 39). 

Under a proposal for a “sub-product” approach to index construction, index
makers are urged to classify their data in terms of sub-products of the
vertical stages through which each end-product passes. The sub-products
would be measured in characteristic units and assigned incremental weights.
This procedure is advanced as preferable to making indexes from end-prod- 
uct data for various reasons, most of which can be summarized under the 
headings of greater accuracy and greater flexibility for analysis. 

The sub-product approach, in effect, is already used in the production in- 
dexes of most countries. These indexes follow an “industry” organization, 
with industries separated from one another both horizontally and vertically. 
Thus, there usually are separate industries, and separate index components 
with incremental (value added) weights, for iron ore, pig iron, and steel. 
To extend the method further would require that the “industry” categories 
now used be refined vertically into smaller elements. Undoubtedly this could 
be done in some lines, and it would be desirable to carry it as far as possi- 
ble—particularly where inter-stage inventories are customarily accumulated, 
so that operating rates can differ from one stage to the next. However, in 
most industries as presently defined in the United States any further vertical 
refinement would mean separating successive processes within individual 
plants. With a few exceptions, the difficulties of reporting quantities and, to 
a greater extent, values, would multiply very quickly. 

The author says that the necessary “reorientation of Federal and other 
statistical reporting systems on a grand scale seems very unlikely,” but 
seems to think that it is feasible, and may be undertaken “after some dis- 
illusionment” with present measures. The feasibility of such a large-scale 
program in the foreseeable future is extremely dubious, and if this is the 
“key to substantial further progress,” substantial progress is improbable. 
But there are many keys to progress. One, for example, would be to continue 
to fill out the list of products for which reliable current output data are pos- 
sible. Significant advances have been made, but the gaps still remaining are 
important. Even apart from feasibility questions, this project might well be 
given precedence over the collection of sub-product data. Progress can also 
be made in other important ways, such as refining industries horizontally— 
that is, estimating the value added weight for a product class from census 
data for plants concentrating on that class of product. While this, too, has 
definite limits (some commodities are almost always produced in associa- 
tion with certain others) it would require only more effective exploitation 
of existing data and not collection of additional data. Also, it would be use- 
ful to supplement the usual indexes, covering successive stages and employ- 
ing incremental weights, with value-weighted end-product measures, con- 
fined, say, to finished consumers’ goods and producers’ equipment. Such 
end-product measures would serve many purposes not adequately served by 
present indexes. 

In another recommendation, “free-composition” indexes are advanced as 
preferable to the “customary” chain indexes for resolving “the problem of 
discontinuity of product series due to changes in classification, specifications,
and variety of goods made and reported” (page 70). The discussion
turns on only one of these points—changes in the variety of goods made. 
To handle this problem, which is largely one of “new” products, the author 
proposes writing zeros for periods when the production of a particular commodity
was zero, and inventing a hypothetical price for weighting purposes
if production was zero in the weight year. This reasonable solution to the new 
product problem has in fact already been used explicitly or implicitly in a 
number of instances. 
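
The zero-quantity treatment described above can be illustrated with a small fixed-weight quantity index. This is only a sketch under stated assumptions: the two products, their outputs, and in particular the invented weight-year price for the new product are hypothetical figures and do not come from the book under review.

```python
# Hypothetical output by period for two products. "new" is not yet
# produced in period 0, so its quantity there is written as zero.
quantities = {
    "old": [100.0, 100.0, 100.0],
    "new": [0.0, 10.0, 20.0],
}

# Weight-year (period 0) prices. The price for "new" is an invented,
# hypothetical figure, since the product had no market in period 0.
weight_prices = {"old": 1.0, "new": 2.0}


def fixed_weight_index(quantities, weight_prices, base=0):
    """Return 100 * sum(p_w * q_t) / sum(p_w * q_base) for each period t."""
    n_periods = len(next(iter(quantities.values())))
    aggregates = [
        sum(weight_prices[name] * series[t] for name, series in quantities.items())
        for t in range(n_periods)
    ]
    return [100.0 * value / aggregates[base] for value in aggregates]


index = fixed_weight_index(quantities, weight_prices)
```

With these figures the index stands at 100, 120, and 140. The result depends directly on the invented price, which is where the element of judgment enters.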

This alone, however, would not seem to justify condemning “the ritual of 
shifting the time base, the weights, and the product classes, and then chain- 
ing the links” as superfluous “acrobatics.” Weights are usually changed pe- 
riodically so that they may continue to be reasonably representative of cur- 
rent relationships. The resulting separate links are chained together to avoid 
breaks in the series. The question of how frequently weights should be 
changed is certainly subject to debate, as it involves a compromise between 
the gain in relevance for some comparisons resulting from recent weights 
and the difficulties of interpretation that linking introduces. The author 
does not comment directly on this subject, leaving the implication that in 
his view weights should not be changed, or, perhaps, that differently-weighted 
segments should not be linked together. Apart from weight changes, the 
linking process is often used for individual components of an index to meet 
some of the other problems the author originally lists, including changes in 
classifications used for the reported data, and changes in the variety of goods 
for which figures are reported. It is not made clear how these difficulties can 
be met by “free-composition” measures. 
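
The chaining procedure discussed in this paragraph can be sketched as follows. The quantities and the two weight sets are hypothetical, and the functions `link` and `chain` are illustrative names only; the sketch merely shows how separately weighted links are multiplied into one continuous series, and how the result can differ from a series computed with the original weights throughout.

```python
def link(weights, q_from, q_to):
    """Ratio of two quantity aggregates valued with the same weights."""
    numerator = sum(weights[k] * q_to[k] for k in weights)
    denominator = sum(weights[k] * q_from[k] for k in weights)
    return numerator / denominator


def chain(links, start=100.0):
    """Multiply successive links together to form one continuous series."""
    series = [start]
    for ratio in links:
        series.append(series[-1] * ratio)
    return series


# Hypothetical quantities for products "a" and "b" in three periods.
q0 = {"a": 10.0, "b": 10.0}
q1 = {"a": 12.0, "b": 10.0}
q2 = {"a": 12.0, "b": 15.0}

# Hypothetical weights: the first set is replaced after period 1.
early_weights = {"a": 1.0, "b": 1.0}
late_weights = {"a": 1.0, "b": 2.0}

# Chained series: period 0 to 1 uses early weights, 1 to 2 uses late weights.
chained = chain([link(early_weights, q0, q1), link(late_weights, q1, q2)])

# Fixed-weight series: early weights are kept for every comparison.
fixed = [100.0 * link(early_weights, q0, q) for q in (q0, q1, q2)]
```

In this example the two series agree through period 1 and diverge in period 2, which is the compromise between relevance of weights and ease of interpretation that the paragraph describes.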

Other questions may be noted. There are many references to the subject 
of “externality,” a condition which exists when an “average” lies outside 
the range of the terms being averaged. But what all the discussion is about 
is not apparent. For some of the cases where this “danger” is pointed out— 
e.g., productivity measures computed from value-weighted production in- 
dexes and labor input indexes—the relationship is not an average at all. In 
this instance the author himself demonstrates, on page 54, that the ratio 
is equivalent to the product of an average and another term. In another 
case—that of coverage adjustments—the problem as posed would appear to 
be more accurately described as a possibly incorrect assumption rather than 
possible “externality.” 

The treatment of coverage adjustments is hardly adequate. The discus- 
sion and evaluation are confined to one of the two problems such adjust- 
ments are designed to meet, that of representing quantity changes for prod- 
ucts reported in value terms only. The existence of the other problem—that 
of eliminating from an industry measure the output of industry-type prod- 
ucts actually made elsewhere, and included in the quantity data—is merely 
mentioned in a footnote. The author observes on page 63 that while the 
adjustment rests on a specific assumption (similar average price changes 
for two sets of goods), the use of unadjusted measures also implies an assumption
(similar average quantity changes for these goods). But three
pages later he reports with apparent approval that both the WPA and the 
BLS rejected the coverage adjustment because, among other reasons, they 
preferred not to introduce an “additional” assumption. A hypothetical example
is used to demonstrate that adjusted indexes may yield poorer results
than unadjusted measures in the case of new products unreported in quan- 
tity terms (page 67). Calculation indicates, however, that with the prices 
and quantities assumed in the example the “new” product accounts for about 
43 per cent of the industry’s value of output before its growth and 25 per 
cent after. With more realistic figures adjusted indexes would usually be 
found to understate the growth of new products, but by substantially smaller 
amounts than unadjusted measures. The question of new products, inci- 
dentally, while intriguing, seems to be greatly overrated as a practical prob- 
lem, at least for the United States, in this study and elsewhere. 
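
The arithmetic of a coverage adjustment can be sketched in one conventional form, assumed here because the review does not spell out the formula: the unadjusted quantity index is multiplied by the ratio of base-period coverage to current-period coverage, where coverage is the value of products reported in quantity terms divided by the total value of industry output. All figures below are hypothetical.

```python
def coverage_ratio(covered_value, total_value):
    """Share of industry value accounted for by products with quantity data."""
    return covered_value / total_value


def coverage_adjusted_index(unadjusted, base_coverage, current_coverage):
    """Scale the unadjusted index by the change in coverage.

    Assumes prices of covered and uncovered products moved alike,
    which is the assumption the adjustment rests on.
    """
    return unadjusted * base_coverage / current_coverage


# Hypothetical figures: the quantity index for covered products is 110,
# while coverage falls from 80 per cent to 60 per cent of industry value.
unadjusted = 110.0
base_coverage = coverage_ratio(80.0, 100.0)
current_coverage = coverage_ratio(90.0, 150.0)

adjusted = coverage_adjusted_index(unadjusted, base_coverage, current_coverage)
```

Leaving the index at 110 would instead assume that quantities of the uncovered products moved like those of the covered products, so either choice carries an assumption.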

The author catalogues the sins of index users, and asserts that index mak- 
ers prefer comfortable tradition to “the search for promising new paths.” 
The only basis offered for this assertion is the index makers’ continued ne- 
glect of Mr. Siegel’s proposals. Also, in connection with one recent innova- 
tion touched on in the text, the author’s information is incorrect: “To over- 
come objections to publication raised by interested groups aware of the 
practical consequences of a few percentage points, the U. S. Bureau of Cen- 
sus has been obliged to release three differently weighted 1947 indexes, not 
merely one, for each manufacturing industry” (pages 6-7). Mr. Siegel and 
his readers will be glad to know that the decision in the 1939-1947 bench- 
mark index project to compile and publish six alternative indexes for each 
industry (under the three weighting systems, with and without coverage 
adjustments) was made in the earliest planning stage of the work, and with- 
out reference to the kind of considerations suggested. 

Emphasis has been given here to some of the more controversial aspects 
of the study, but as has been noted it contains much useful and instructive 
material. The volume comes at a time when interest in the field is high, 
with old indexes undergoing revision and new ones being developed in many 
countries. 













PUBLICATIONS RECEIVED 


Benedict, Murray R. Farm Policies of 
the United States, 1790-1950. New York: 
Twentieth Century Fund, 1953. $5.00. 

Bergson, Abram. Soviet National Income 
and Product in 1937. New York: Columbia
University Press, 1953. $3.75. 

Braithwaite, R. B. Scientific Explanation: 
A Study of the Function of Theory, Prob- 
ability and Law in Science. New York: 
Cambridge University Press, 1953. $8.00. 

Brown, William Adams, Jr., and Opie, 
Redvers. American Foreign Assistance. 
Washington, D. C.: The Brookings Insti- 
tution, 1953. $6.00. 

Cairncross, A. K. Home and Foreign 
Investment, 1870-1913. New York: Cam- 
bridge University Press, 1953. $6.00. 

Cochran, William G. Sampling Tech- 
niques. New York: John Wiley and Sons, 
1953. $6.50. 

David, F. N. A Statistical Primer. Lon- 
don: Charles Griffin and Co., 1953. 22s. 

Deane, Phyllis, ed. Bibliography on In- 
come and Wealth, Vol. II. Cambridge: 
Bowes and Bowes, 1953. 37s. 6d. 

Desai, R. C. Standard of Living in India
and Pakistan, 1931-32 to 1940-41. Bombay:
The Popular Book Depot, 1953. Rs 20/.

Doob, J. L. Stochastic Processes. New 
York: John Wiley and Sons, 1953. $10.00. 

Finney, D. J. An Introduction to Statis- 
tical Science in Agriculture. New York: 
John Wiley and Sons, 1953. $3.75. 

French, David G. An Approach to 
Measuring Results of Social Work. New 
York: Columbia University Press, 1952. 
$3.00. 

Frumkin, Gregory. Population Changes 
in Europe since 1939. New York: Augustus
M. Kelley, 1951. 

Gadgil, D. R. Poona, A Socio-Economic 
Survey, Part II. Poona: Gokhale Institute 
of Politics and Economics, 1952. 30s. 

Ginzberg, Eli, and Bray, Douglas W.
The Uneducated. New York: Columbia 
University Press, 1953. $4.50. 

Goedicke, Victor. Introduction to the 
Theory of Statistics. New York: Harper and 
Brothers, 1953. $4.50. 

Grebler, Leo. The Role of Federal Credit 
Aids in Residential Construction. New York: 
National Bureau of Economic Research, 
1953. Paper. $1.00. 

Hampel, Louis F. Workbook in Business 
Statistics: Third Edition. Homewood, Illinois:
Richard D. Irwin, Inc., 1953. Paper.
$3.50. 


Hartkemeier, Harry P. Elementary Sta- 
tistical Analysis. Dubuque, Iowa: Wm. C. 
Brown Co., 1953. Paper. $6.00. 

Hartkemeier, Harry Pelle. Punch-Card 
Methods. Dubuque, Iowa: Wm. C. Brown 
Co., 1952. Paper. $5.00. 

Hawley, Amos H. Interstate Migration 
in Michigan: 1936-1940. Ann Arbor: Uni- 
versity of Michigan, 1953. Paper. $1.50. 

Heck, Harold J. Foreign Commerce. New 
York: McGraw-Hill Book Co., 1953. $6.50. 

Herdan, Gustav. Small Particle Sta- 
tistics. Houston: Elsevier Press (402 Lovett 
Blvd.), 1953. $12.00. 

Hood, William C., and Koopmans, Tjal- 
ling C., eds. Studies in Econometric Method. 
New York: National Bureau of Economic 
Research, 1953. $5.50. 

Johnson, Ellis A. The Application of 
Operations Research to Industry. Published 
by the author, Operations Research Office, 
Johns Hopkins University, Chevy Chase, 
Maryland. Paper. 

Klein, Lawrence R. A Textbook of 
Econometrics. Evanston, Illinois: Row, 
Peterson and Company, 1953. 

Korner, Emil. The Law of Freedom as the 
Remedy for War and Poverty, Vols. One and 
Two. London: Williams and Norgate, Ltd., 
1951. 21s. each. 

Kuznets, Simon. Economic Change: Se- 
lected Essays in Business Cycles, National 
Income, and Economic Growth. New York: 
W. W. Norton and Co., Inc., 1953. $4.50. 

Kuznets, Simon; assisted by Elizabeth 
Jenks. Shares of Upper Income Groups in 
Income and Savings. New York: National 
Bureau of Economic Research, 1953. $9.00. 

Lacey, Oliver L. Statistical Methods in 
Experimentation, An Introduction. New 
York: The Macmillan Co., 1953. $4.50. 

Lehman, Harvey C. Age and Achieve- 
ment. Princeton: Princeton University 
Press, 1953. $7.50. 

Lindquist, E. F. Design and Analysis of 
Experiments. Boston: Houghton Mifflin 
Co., 1953. $6.50. 

Marx, Daniel, Jr. International Shipping 
Cartels. Princeton: Princeton University 
Press, 1953. $6.00. 

National Bureau of Standards. Introduction
to the Theory of Stochastic Processes
Depending on a Continuous Parameter.
Washington: U. S. Government Printing
Office, 1953. Paper. 30 cents. 

Ore, Oystein. Cardano, the Gambling 
Scholar. Princeton: Princeton University 
Press, 1953. $4.00. 




Pierson, Frank C. Community Wage 
Patterns. Berkeley: University of California 
Press, 1953. $3.75. 

Piquet, Howard S. Aid, Trade and the 
Tariff. New York: Thomas Y. Crowell Co., 
1953. $5.00. 

Quenouille, M. H. The Design and 
Analysis of Experiment. New York: Hafner 
Publishing Co., 1953. $7.50. 

Revue de Statistique Appliquée, Vol. I, 
No. 1. Paris: Université de Paris, 1953.
Paper. 

Rhys-Williams, Lady. Taxation and In- 
centive. New York: Oxford University 
Press, 1953. $3.50. 

Shephard, Ronald W. Cost and Produc- 
tion Functions. Princeton: Princeton Uni- 
versity Press, 1953. Paper. $2.00. 

Tax Institute Inc. Excess Profits Taxation:
Symposium. Princeton: 1953. $5.00.

Thomas, Brinley. Migration and Eco- 
nomic Growth. New York: Cambridge Uni- 
versity Press, 1953. 

Thompson, Alexander John. Tracts for
Computers. No. XXII. Logarithmetica
Britannica, Pt. 11 (The ninth and final part).
New York: Cambridge University Press,
1953. Paper. $8.50.

Trumpler, Robert J., and Weaver, 
Harold F. Statistical Astronomy. Berkeley: 
University of California, 1953. $7.50. 

Virginia Agricultural Experiment Station, 
Statistical Methods for Sensory Difference 
Tests of Food Quality. Bi-Annual Report 
No. 6. Blacksburg, Virginia: 1953. Paper, 
mimeo. 

Weston, J. Fred. The Role of Mergers in 
Growth of Large Firms. Berkeley: Univer- 
sity of California Press, 1953. $3.50. 

Whitin, Thomson M. The Theory of 
Inventory Management. Princeton: Prince- 
ton University Press, 1953. $4.50. 

Woytinsky, W. S. and Associates. Em- 
ployment and Wages in the United States, 
New York: The Twentieth Century Fund, 
1953. $7.50. 

Zimmerman, L. J. The Propensity to 
Monopolize. Amsterdam: North-Holland 
Publishing Company, 1952. Paper. $2.00. 








RANDOM DIGITS (9001-13,750)


From A Million Random Digits, to be published by the Rand Corporation, Santa Monica, California.


Digits 6876-8125 were published in Vol. 48, p. 672 (September 1953)


23780 28391 05940 55583 81256
45325 05490 65974 11186 15357
88240 92457 89200 94696 11370
42789 69758 79701 29511 55968
97523 17264 82840 59556 37119


08853 59083 95137 76538 44155
80274 79932 44236 10089 44373
82805 21149 03425 17594 31427
64971 49055 95091 08367 28381
03606 46497 28626 87297 36568


67286 28749 81905 15038 38338
65670 72111 91884 66762 11428 
14262 09513 25728 52539 86806 
57375 85062 89178 08791 39342 
39483 62469 30935 79270 91986 


51206 65749 11885 49789 97081 
70908 21506 16269 54558 18395 
69944 65036 63213 56631 88862 
94963 22581 17882 83558 31960 
99286 45236 47427 74321 67351 


16075 20517 69980 18293 44047 
73375 62251 58871 70174 52372 
30487 38794 36079 23362 24902 
69473 45950 18225 09899 87377 
27703 83717 18913 66371 53629 


67612 72738 26995 50933 92936 
59042 37595 04931 73622 69902 
59609 35653 15970 37681 96326 
35354 65770 15365 41422 29451 
79452 71674 30260 97303 31002 


49867 89294 59232 31776 54919 
16719 06144 82041 38332 64452 
46970 45907 99238 74547 19704 
35747 78956 11478 41195 58135 
71838 07526 07985 60714 88627 


24361 34534 70169 24805 63215 
19278 17082 26997 32295 10894 
58124 84721 23544 88548 65626 
12025 16908 82841 24060 40285 
50326 86370 91949 19017 83846 




38175 38422 64677 80358 52629 
21805 10371 95812 84665 74511 
75517 82119 09199 30322 33352 
19195 92261 44757 98628 57916 
77869 08582 63168 21043 17409 


79419 22359 65206 54941 95992 
59914 04146 01419 48575 77822 
43374 25473 60982 27119 16060 
22199 11865 26201 18570 72803 
13786 27475 31254 36050 73736 


45445 41059 55142 55585 39829 
21067 57238 35352 57741 98761 
30302 95327 12849 15795 97479 
70040 91385 96436 58982 91281 
13351 48321 28357 88526 74396 


15564 04716 14594 22363 85700 
30987 57657 33398 63053 46792 
79172 72764 66446 78864 96004 
57875 45228 49211 69755 27896 
57146 64665 31159 06980 64709 


42826 06974 61063 97640 13433 
93929 01836 36590 75052 89475 
83585 00414 62851 48787 28447 
27548 37516 24343 63046 02081 
32982 56455 53129 77693 25022 


30104 67126 76656 29347 28492 
35240 00818 09136 01952 48442 
94031 62209 43740 54102 76895 
99321 11331 06838 03818 77063 
78236 71732 04704 61384 57343 


43108 56592 42467 88801 91280 
91058 60958 20706 31929 57422 
98172 44346 60430 59627 26471 
12523 57345 41246 98416 08669 
66682 82517 33150 27368 53375 


01056 27534 23085 49602 74391 
18730 96197 64483 40364 90913 
07794 60475 49666 17578 12830 
48883 77154 74973 42096 34934 
70171 59431 76033 40076 20292 


48830 55029 10371 09963 85857 
73151 64463 50058 11468 93553 
06571 95934 09132 13746 82514 
76609 52553 47508 25775 91309 
32138 61197 95476 69442 54574 


04855 27029 01542 72443 72302 
65434 12124 91087 87800 34870 
86800 16781 65977 65946 65728
51233 81409 46773 69135 36170 
92933 77341 20839 36126 18311 







53266 
06779 
41957 
96837 
74839 


61638 
05756 
46064 
31926 
35145 


34755 
49925 
33225 
19136 
61324 


83246 
99644 
82500 
15544 
77589 


48855 
22445 
04446 
32775 
00216 


37795 
11036 
25460 
63638 
20033 


67885 
52380 
93140 
84663 
91830 


79937 
32960 
12779 
07521 
78985 


02968 
39935 
33623 
80423 
39750 


31180 
58682 
43277 
86354 
89239 







30579 44188 25432 35278 03553 
16051 04475 13400 25994 11774 
83366 14896 04219 82244 74473 
83715 94737 69973 00332 33049 
13648 85250 28323 20385 08095 


05128 59866 51281 68124 75064 
86746 89698 56020 37810 88684 
87513 17690 61427 72914 48563 
02622 41026 80875 41293 21529 
64981 28180 38629 76962 93285 


57888 13938 38554 86836 02195 
56316 37723 00234 21424 26664 
98849 72762 59767 52497 24227 
51632 54799 27973 68568 68465 
12874 82160 67202 85199 27908 


57580 77884 07032 01671 53362 
51875 64611 19736 25589 46569 
39133 30393 58319 85098 66519 
24541 61477 89731 18421 29861 
50859 84746 28302 13264 07595


28119 24200 09110 28485 30326 
45206 53300 38688 39968 32604 
57571 65919 56405 17839 92073 
52829 01172 08915 11467 14793 
00134 36233 89434 38669 91592 


99826 64005 94325 73553 78280
11694 46262 55067 64603 59752
57622 93328 98885 07783 04351
82691 51238 14106 43983 33356
88799 65621 59809 37850 66128


69125 95591 81168 99246 66416 
74698 44233 67602 21615 72336 
77451 47350 21234 67672 80567 
61715 96485 22121 98844 59289 
92735 45064 50924 00865 19690 


72353 45775 68590 85685 99975 
12979 05720 92754 76911 55240 
44365 70254 50864 36619 30094 
49076 18439 29522 42541 79327 
78143 65919 13699 91844 10676 


03474 76025 97043 33834 44638 
35870 89158 55864 98078 50563 
73887 67928 60045 70782 11937 
45968 73667 65062 73306 76045 
67622 54579 17279 67440 56441 


66913 60664 67547 39523 02043 
74859 62155 09234 47367 13047 
90879 44969 11129 17139 79630 
95909 82459 96218 60768 76417 
29212 40873 41590 67255 30757 











The new statistics for


decision-making 


—recently developed principles and 
techniques that have already proved 
remarkably effective—are presented in 


DESIGN FOR 
DECISION 


by 
Irwin D. J. Bross 


The striking new statistical theories incorporating value 
factors, developed during the past ten years, are explained 
in this new book with brilliant clarity. You will find here 
the meat of these theories, the basic procedures, and the 
many ways in which they can be effectively applied. The 
author analyzes the nature of decision, presents the funda- 
mentals of prediction and probability, and then discusses 
the new procedures for ascertaining values, setting up the 
factors in a problem, appraising data, using models, sam- 
pling, measurement, and inference. Because these methods 
are proving so valuable both in industry and in science, you 
will want the information in this book. 


$4.25 at your bookstore or from


The Macmillan Company
60 FIFTH AVENUE, NEW YORK 11





Please mention the Journal of the AMERICAN STATISTICAL ASSOCIATION in writing advertisers 











When a student enrolls in an introductory
statistics course, both the professor and the chosen text can determine
which tack he'll take. Dr. Lacey’s primary concern in Statistical
Methods in Experimentation is to present elementary statistical
techniques in a meaningful, logical manner. The methods are presented
as a means of inference or decision-making and not as mystic
manipulations to be memorized as an end in themselves. He includes
such topics as t-test, chi-square, regression, and correlation techniques.
With an understanding of these techniques, a student can
follow and interpret eighty per cent or more of experimental reports
in the literature and can plan and execute significant experimental
work of his own. 1953 249pp. $4.50


STATISTICAL METHODS IN EXPERIMENTATION 
by Oliver Lacey The Macmillan Company 


60 FIFTH AVENUE, NEW YORK 11, N.Y. 











ESTADISTICA 


Journal of the Inter American Statistical Institute 


Vol. XI, No. 41 December 1953 
CONTENTS 


Las Estadísticas y el Interés Público (traducción) . . . Aryness Joy Wickens

Movimientos peetesies Internos owl Costa Rica Registrados Por el Censo de 1950 
Wilburg Jiménez Castro

Mark Sensing the Conaéien Census ‘Seenis . 

El Censo de Población de la Colonia Bethania, Guatemala, Marzo 1953 . . . Vicente Secaira E.

Basic Agricultural Statistical Methods . . . Charles F. Sarle


Informe de la Comisién Especial para - Estudio del Costo de » hens de los beapaenyel 
dores en Venezuela 


Aspectos vepyeree een del Censo Industrial Aagentine Peapectete pore, 1954 . 
José Maria Rivera 

El Censo Nacional de Cuba, 1953 

El Centro Interamericano de Bioestadistica, Chile ..... cntipeneeaes Mornin penta 

Spanish Coding Manual for Application of the SITC John B. Rothrock 


Institute Affairs. Statistical News. Publications. 
Published quarterly. Annual subscription price $3.00 (U.S.) 


Inter American Statistical Institute 
Pan American Union 
Washington 6, D.C. 
















Important HOLT Texts 
STATISTICAL INFERENCE 


HELEN M. WALKER and JOSEPH LEV
“An exciting text—one which has broken from the tradi- 
tional curriculum of the undergraduate course in statistics. 
. . . Its broad coverage and intuitive approach are to be
commended.”
Jacob Cohen, City College of New York 


STATISTICS FOR SOCIOLOGISTS, Revised 


MARGARET JARMAN HAGOOD and DANIEL O. PRICE
“An excellent textbook has been improved by new materials 
on scaling, sampling design and factor analysis. The illustra- 
tive materials have in general been taken from late studies 
and enumerations.” 
David B. Carpenter, Washington University 


ELEMENTARY STATISTICS, Revised 


MORRIS M. BLAIR
“Particularly well-suited to the beginning course in sta- 
tistics. The many illustrations and detailed example solu- 
tions should be of great assistance in teaching.” 
S. L. McDonald, University of Texas 


FIRST COURSE IN PROBABILITY AND STATISTICS 
Jerzy NEYMAN 


“. . . the presentation is the best yet, on an elementary level,
of the conceptual bases for a substantial part of present 
statistical theory. . . . Its reading should prove reward- 
ing. ...” 

J. E. Morton, Cornell University, in the American 
Statistical Association Journal 


MATHEMATICS ESSENTIAL FOR ELEMENTARY 
STATISTICS, Revised 


HELEN M. WALKER 





“This book is excellent. The topics are well-chosen and 
beautifully presented. .. .” 
D. A. Grant, University of Wisconsin 


HENRY HOLT and COMPANY 


383 Madison Avenue 
New York 17, New York 










THE FOOD AND AGRICULTURE ORGANIZATION 
OF THE UNITED NATIONS 
PUBLICATIONS ON AGRICULTURAL 
ECONOMICS AND STATISTICS 


MONTHLY BULLETIN OF AGRICULTURAL ECONOMICS 
AND STATISTICS 

This bulletin includes articles on international developments in the agricultural economic 
situation and on statistical methods, brief notes on current commodity situations, authoritative 
world wide data (by countries) on production, prices and trade in farm products, and other 
information including results of national agricultural censuses. Published in the third week 
of each month, and includes data received up to the 25th of the preceding month. (In English, 
French, and Spanish. About 50 pp. each, 8½ x 11”. Annual subscription: $5. or 25 sh.)


YEARBOOK OF FOOD AND AGRICULTURAL STATISTICS, 
PART I—PRODUCTION, 1952 
Authoritative survey of basic data for all countries on land use, agricultural production, 


crop and livestock production, food supply and agricultural requisites, pre-war and post-war 
years. (In English, French, and Spanish. 300 pp. 8½ x 11”. $5. or 25 sh. a copy.)


YEARBOOK OF FOOD AND AGRICULTURAL STATISTICS, 
PART II—TRADE, 1952 

Authoritative basic data on world trade in farm products and agricultural requisites by 
countries and continents, pre-war and annually 1948 to 1951, with notes on data by com- 
= and — (In English, French, and Spanish. About 250 pp. 842 x 11”. $3.50 or 
17/6d. a copy. 


SECOND WORLD FOOD SURVEY, 1952 


Appraisal of world food supplies and nutritional value, pre-war and post-war; of trends 
in food production and consumption, and of feasible increases in production to meet 
reasonable nutritional goals; with maps and country tables. The analysis covers the world 
situation by regions, and includes relevant data for individual countries. (In English, French, 
and Spanish. 59 pp. 8½ x 11”. $0.50 or 2/6d. a copy.)


COMMODITY REPORTS 


Brief appraisals of the individual commodity situations and outlook: Carpet Wool (1953); 
Grain (Jan. 1953); Rice (Dec. 1952); Cocoa (July 1952). (Mimeographed $0.25, or 1/3d. a copy.) 


COMMODITY BULLETINS 


Comprehensive studies reviewing production and trade developments: Sugar t. 1952); 
Tobacco (Oct. 1952); Dairy Products (Feb. 1953). ($0.50 or 2/6d. a copy.)


COMMODITY POLICY STUDIES 


Analytical studies of national and international policies for agricultural products: No. 1, 
A Reconsideration of the International Wheat Agreement t. 195%), (Pp. 40, $0.50 or 
2/6d. a copy). No. 2, Survey of National Measures for Controlling Farm Prices in Western
European Countries (70 pp., April 1953) ($0.75 or 3/9d. a copy). No. 3, The Long Term
Contract (April 1953), (46 pp. $0.50 or 2/6d. a copy).


All FAO publications are available on order from: 


United States: Columbia University Press Canada: The Ryerson Press 
International Documents Service 299 Queen Street West 
2960 Broadway, New York 27,N.Y. Toronto 2, Ontario, Canada. 


United Kingdom: H.M. Stationery Office or: Documents Sales Service, FAO 
P.O. Box 569 Viale delle Terme di Caracalla, 
London, S.E. 1. Rome, Italy. 
















new book announcements 
McGRAW-HILL BOOK COMPANY, 





PRINCIPLES OF NUMERICAL ANALYSIS 


By ALSTON S. HOUSEHOLDER, Oak Ridge National Laboratory. International Series
in Pure and Applied Mathematics. 274 pages, $6.00 


Here is a senior-graduate text which develops the mathematical principles upon which 
many computing methods are based and in the light of which they can be assessed. 
Directed primarily toward digital computation, the book is designed to give a unified 
treatment rather than a complete catalogue of methods. Treatment is primarily theoreti- 
cal. Techniques for making estimates of errors are indicated wherever possible. Func- 
tional equations as such are not discussed, but emphasis is placed upon the methods of 
solving the finite systems and performing the interpolations which are required in the 
digital solution of functional equations. 


INTRODUCTION TO STATISTICAL ANALYSIS 
By WILFRID J. DIXON and FRANK J. MASSEY, JR., University of Oregon. 370 pages,


This unique text presents the basic concepts of statistics in a manner which will show 
the student the generality of the application of the statistical method. Both classical and 
modern techniques are presented with emphasis on the understanding and use of the
technique.


INTRODUCTION TO THE THEORY OF STATISTICS 

By ALEXANDER MOOD, Rand Corporation, Santa Monica, Calif. 431 pages, $6.50
A text for standard courses in statistical theory with a calculus prerequisite. The author 
develops the theory of probability, distribution and sampling. The book explores the two 
major problems of scientific inference: the estimation of quantities, and the testing 
of hypotheses. 


STATISTICAL THEORY IN RESEARCH 


By R. L. ANDERSON, University of North Carolina, and T. A. BANCROFT, Iowa
State College, 399 pages, $7.00 


A combined text and reference work, this book presents basic statistical theory, including 
the elementary principles of probability, population, and sampling theory with moment- 
generating functions, orthogonal linear forms, and the theory of estimation and tests 
of significance. The last half of the book covers the theory of least squares and its use 
in the analysis of actual experimental data, including multiple regression, experimental 
design, and variance component models. 


Send for copies on approval 





McGRAW-HILL BOOK COMPANY 
330 West 42nd Street • New York 36, N. Y.





Please mention the Journal of the AMERICAN STATISTICAL ASSOCIATION in writing advertisers 














The biography of a Mathematical genius 


Cardano, the
Gambling Scholar 


By OYSTEIN ORE. Cardano, next to Vesalius the greatest 
physician of his day, was also a devoted and skilled gambler 
who played for personal pleasure and profit. His mathematical 
genius enabled him to devise simple rules of probability for 
his own benefit and for his gambling contemporaries. 


In this biography, Cardano’s gambling studies are de- 
ciphered for the first time, and a complete translation of the 
Book on Games of Chance is appended. 


“A first-rate contribution to the history of 
science.”—Scientific American 
264 pages. Illustrated, $4.00 


Order from your bookstore 
PRINCETON UNIVERSITY PRESS 














The Journal of 
Industrial Economics 


Editors: P. W. S. Andrews, Sir Henry Clay, and Professors Joel Dean, 
R. B. Heflebower, John Jewkes, and E. S. Mason 


Three issues yearly, Annual subscription, £3: single copies £1:40. 


This Anglo-American journal is primarily devoted to the economic problems of 
industry and commerce. It contains articles of a scientific character, of value 
both to economists and business men, and offers a medium for the academic 
industrial research work that is being carried out on an increasing scale in 
America and Britain. 


Contents of the last number: 


Operational Research, by I. S. LLOYD; The Influence of Economic Factors 
on the Location of Oil Refineries, by J. D. BUTLER; Integration in the Oil 
Industry, by P. H. FRANKEL; Industrial Economic Problems in the Post-War
Aluminium Market in the U.S., by LEONARD A. DOYLE; The Problem of 
Introducing Modern Systems of Wage Payment into the Boot and Shoe In- 
dustry, by J. CRAWFORD; The Control and Oversight of Capital Expendi- 
ture Within Unilever, by L.G. NORTON & J. E. WALL. 


Send dollar check to 
BASIL BLACKWELL • OXFORD • ENGLAND






















Books from Chicago


The Study of Behavior
Q-TECHNIQUE AND ITS METHODOLOGY

By WILLIAM STEPHENSON. In a long-awaited work on the methodology of the
study of human behavior, Professor Stephenson presents a clear and
comprehensive exposition of his revolutionary “Q-Technique.” $7.50


The Design of Social Research

By RUSSELL L. ACKOFF. A textbook on the design of social experiments, based upon
all developments in scientific methodology, written for all teachers who believe their
students must come to grips with empiricism and empirical methods. $7.50


At your bookstore or from
THE UNIVERSITY OF CHICAGO PRESS
5750 Ellis Ave., Chicago 37, Ill.














THE JOURNAL OF MARKETING 


PUBLISHED QUARTERLY BY THE AMERICAN MARKETING ASSOCIATION 


Partial List of Contents, Vol. XVIII, No. 2, October 1953 


Can Advertising Markets be Defined or Measured as Geographical Areas
Critical Store Visits .......... David Carson
An Appraisal of the BLS Consumer Price Index .......... Olive E. Vaughan
SRDS’ Estimates of Local Consumer Incomes .......... Joseph H. White
Special Incentives for Salesmen .......... Albert Haring and Robert H. Myers
Mortality of Seattle Grocery Wholesalers .......... Richard R. Still


Subscription price, $4 per year in the U.S.; other countries, $5 per year. Address com- 
munications to: Albert W. Frey, Editor-in-Chief, Dartmouth College, Hanover, New 
Hampshire; Lincoln Clark, Managing Editor, New York University, 90 Trinity Place, 
New York 6, N.Y.; Thomas J. McGann, Business Manager, Iona College, New Rochelle, 
New York. 




































Just Published


CAMBRIDGE ELEMENTARY 
STATISTICAL TABLES 


by D. V. LINDLEY and J. C. P. MILLER


A convenient collection of the more familiar and elementary statistical 
functions and tests of significance. The tables contain: 





The Normal Distribution Function
The Normal Frequency Function
Percentage Points of the Normal Distribution
Percentage Points of the t-Distribution
Transformation of the Correlation Coefficient
Percentage Points of the χ²-Distribution
Conversion of Range to Standard Deviation
Percentage Points of the F-Distribution
Random Sampling Numbers
Squares
Square Roots
Reciprocals
Reciprocal Square Roots
Logarithms
Antilogarithms
Inverse Circular and Hyperbolic Root-Sines
Logarithms of Factorials
Proportional Parts


Price $1.00 CAMBRIDGE UNIVERSITY PRESS 
32 East 57th Street—New York 22 
































ECONOMETRICA 


Journal of the Econometric Society 
Contents of Vol. 21, No. 4—October 1953 


M. ALLAIS: Le Comportement de l'Homme Rationnel devant le Risque:
Critique des Postulats et Axiomes de l'École Américaine
KARL A. FOX: A Spatial Equilibrium Model of the Livestock-Feed Economy in the United States
WALTER D. FISHER: On a Pooling Problem from the Statistical Decision Viewpoint
A. DVORETZKY, J. KIEFER, AND J. WOLFOWITZ:
On the Optimal Character of the (s,S) Policy in Inventory Theory
GERARD DEBREU AND I. N. HERSTEIN: Nonnegative Square Matrices
DAVID C. MCGARVEY: A Theorem on the Construction of Voting Paradoxes
REPORT OF LUCKNOW MEETING
BOOK REVIEWS: Wesley Clair Mitchell, The Economic Scientist (Arthur F. Burns,
Ed.). Review by R. Gordon; Introduction to the Theory of Games (J. C. C.
McKinsey). Review by Martin Beckmann; A Study of Moneyflows in the United States
(Morris A. Copeland). Review by Clark Warburton; A Guide to Keynes (Alvin H.
Hansen). Review by Howard R. Bowen; Collected Economic Papers (Joan Robinson).
Review by K. J. Arrow; Wages and Salaries in the United Kingdom 1920-38 (Agatha
L. Chapman). Review by A. Henderson; The Trend of Government Activity in the
United States since 1900 (Solomon Fabricant). Review by Theo. Surányi-Unger; Die
Konjunkturschwankungen (Walter Adolph Jöhr). Review by K. W. Rothschild; Readings
in Business Cycles and National Income (Alvin H. Hansen and Richard V.
Clemence, Eds.). Review by William J. Baumol; Business Cycles in the United
Kingdom, 1870-1914 (J. Tinbergen). Review by E. J. H. Buckatzsch.
COMMUNICATIONS 


Communication from Dickson H. Leavens
Note on Membership Listing
Announcement of Washington Meeting
List of Proposed New Members

Published Quarterly. Subscription rate available on request.


The Econometric Society is an international society for the advancement of economic 
theory in its relation to statistics and mathematics. 

Subscriptions to Econometrica and inquiries about the work of the Society and the 
procedure in applying for membership should be addressed to Rosson L. Cardwell, Acting
Secretary, The Econometric Society, The University of Chicago, Chicago 37, Illinois, U.S.A. 






















Take a Busman's Holiday and Learn


How to Lie with
Statistics


BY DARRELL HUFF


This lively dissertation on various methods of statistical pre- 
varication has more laughs per page (and more sound in- 
formation) than any other book of the same title. 
85 illustrations by Irving Geis. Coming Jan. 4 
$2.95 at all bookstores 


W. W. NORTON & CO., 101 Fifth Avenue, N. Y. 3 

















JOURNAL OF FARM ECONOMICS 
Published by 
THE AMERICAN FARM ECONOMIC ASSOCIATION 


Editor: Lawrence W. Witt 
Michigan State College, East Lansing, Michigan 


Volume XXXV November 1953 Number 4 
Linear Programming Applied to Feed-Mixing .......... Walter D. Fisher and Leonard W. Schruben
Hedging Reconsidered .......... Holbrook Working
Grain Roughage Substitution .......... Emil Rauchenstein
Proper Planning Reduces Research Error .......... R. L. Anderson
Mapiomnsal Tategretien Ih. TRCOGS «0. o.6.< so ccc cas case sacesemes J. H. Richter 
Agricultural Problems of the Middle East .......... L. D. Schweng
Analysis of Alternative Fertilizer-Yield Relationships Paul R. Johnson 


This Journal contains additional articles, notes, and book reviews and is published 
in February, May, August, November, and December. Yearly subscription $5.00. 


Secretary-Treasurer EARL BUTZ 
Department of Agricultural Economics 
Purdue University, Lafayette, Indiana 






Volume 40 BIOMETRIKA Parts 3 and 4


CONTENTS 
The population frequencies of species and the estimation of population parameters. By
I. J. GOOD

Capture-recapture analysis. By J. M. HAMMERSLEY


The use of chain-binomials with a variable chance of infection for the analysis of
intra-household epidemics. By NORMAN T. J. BAILEY


Spread of diseases in a rectangular plantation with vacancies. By G. H. FREEMAN 
Tests of significance for concurrent regression lines. By E. J. WILLIAMS 


Approximate confidence intervals. II. More than one unknown parameter. By M. S.
BARTLETT


Non-normality and tests on variances. By G. E. P. BOX
Approximating to the distribution of measures of dispersion by a power of χ². By J. H.
CADWELL


The power function of some tests based on range. By H. A. DAVID 
Some simple approximate tests for Poisson variates. By D. R. COX 
Orthogonal polynomial fitting. By JOHN WISHART and THEOCHARIS METAKIDES 


Population differences between species growing according to simple birth and death
processes. By J. H. DARWIN


Modifications to the variate-difference method. By M. H. QUENOUILLE

Moments of the rank correlation coefficient τ in the general case. By R. M. SUNDRUM
99.9% and 0.1% points of the χ²-distribution. By T. LEWIS

Tables of Symmetric Functions, Part IV. By F. N. DAVID and M. G. KENDALL 
Miscellanea 

Some procedures for comparing Poisson processes or populations. By ALLAN BIRNBAUM 


Scale factors and degrees of freedom for small sample sizes for χ-approximation to the
range. By GEORGE WM. THOMSON


The third moment of Gini’s mean difference. By A. R. KAMAT 

A method of systematic sampling based on order properties. By R. M. SUNDRUM 
A note on ordered least-squares estimation. By F. DOWNTON 

A note on the evaluation of the multivariate normal integral. By F. N. DAVID 


A graphical method for the analysis of statistical distributions into two normal
components. By ERIC J. PRESTON


A note on regions for tests of kurtosis. By G. E. P. BOX 

The frequency justification of sequential tests—addendum. By G. A. BARNARD 

Reviews 
O. KEMPTHORNE’S “The design and analysis of experiments”

M. H. QUENOUILLE’S “Associated Measurements” and “The design and analysis
of experiment”

C. I. BLISS’S “The Statistics of Bio-assay”: D. J. FINNEY’S “Statistical method
in biological assay” and “Probit analysis”

A. HALD’S “Statistical theory with engineering applications” and “Statistical tables
and formulas”

A. WALD’S “Statistical decision functions”

NATIONAL BUREAU OF STANDARDS (i) “Table of arctangents of rational
numbers”: (ii) “Tables of the exponential function e^x”: (iii) “A guide to tables
of the normal probability integral”: (iv) “Tables of the Bessel Functions
Y0(x), Y1(x), K0(x), K1(x)”: (v) “Table of arctan x.”


Biometrika 
University College, London 
Gower Street, W.C.1 
London, England 

























The Annals of Mathematical Statistics 


THE OFFICIAL JOURNAL OF THE INSTITUTE OF 
MATHEMATICAL STATISTICS 


Vol. 24, No. 3—September, 1953 


An Essentially Complete Class of Decision Functions for Certain
Standard Sequential Problems . . . Milton Sobel

Stochastic Processes Occurring in the Theory of Queues and Their
Analysis by the Method of the Imbedded Markov Chain . . . David G. Kendall

On the Stochastic Matrices Associated with Certain Queuing Processes . . . F. G. Foster

Bounds on a Distribution Function When its First n Moments are
Given . . . H. L. Royden

On Some Theorems in Combinatorics Relating to Incomplete Block
Designs . . . Kulendra N. Majumdar

The Behrens-Fisher Problem for Regression Coefficients . . . D. A. S. Fraser

Sequential Decision Problems for Processes with Continuous Time
Parameter. Problems of Estimation . . . A. Dvoretzky, J. Kiefer and J. Wolfowitz

Distribution of Quadratic Forms and Ratios of Quadratic Forms . . . John Gurland

Scaling and Error Analysis for Matrix Inversion by Partitioning . . . Mark Lotkin and Russell Remage

On Certain Classes of Statistical Decision Procedures . . . H. S. Konijn

A Double Sample Test Procedure . . . Donald B. Owen

Some Tests Based on Ordered Observations from Two Exponential
Populations . . . Benjamin Epstein and Chia Kuei Tsao


NOTES:

Power Functions of the Sign Test and Power Efficiency for Normal
Alternatives . . . W. J. Dixon

The Admissibility of Certain Invariant Statistical Tests Involving a
Translation Parameter . . . E. L. Lehmann and C. M. Stein

A Note on Dodge’s Continuous Inspection Plan . . . Gerald J. Lieberman

On the Power of a One-Sided Test of Fit for Continuous Probability
Functions . . . Z. W. Birnbaum








Address orders for subscriptions and back numbers to Professor K. J. 
Arnold, Secretary, Institute of Mathematical Statistics, Department 
of Mathematics, Michigan State College, East Lansing, Michigan 




























POPULATION STUDIES 
A JOURNAL OF DEMOGRAPHY 
Editor: D. V. GLASS 
Vol. VII, No. 2 November 1953 


Summaries of Contents CONTENTS 


CLYDE V. KISER and P. K. WHELPTON—Résumé of the Indianapolis 
Study of Social and Psychological Factors Affecting Fertility 


JOHN HAJNAL—Age at Marriage and Proportions Marrying 


W. BRASS—The Derivation of Fertility and Reproduction Rates from Re- 
stricted Data on Reproductive Histories 


S. N. EISENSTADT—Analysis of Patterns of Immigration and Absorption 
of Immigrants 


C. J. MARTIN—Some Estimates of the General Age Distribution, Fertility 
and Rate of Natural Increase of the African Population of British East 
Africa 

Book Reviews 

Books and Publications Received. 

Subscription price per volume of three parts is $5.00 


Published for the POPULATION INVESTIGATION COMMITTEE by


CAMBRIDGE UNIVERSITY PRESS 
32 East 57th Street, New York 22, N.Y. 














WANTED 


Back issues of the Journal of the American Statistical Association.
If you can spare your copies of the following issues of
the Journal, write to the Office of the Secretary of the Association.


Volume 36, No. 213, March 1941
Volume 42, No. 238, June 1947
Volume 42, No. 239, September 1947
Volume 48, No. 261, March 1953
Volume 48, No. 262, June 1953


The American Statistical Association 
1108 16th Street, N.W. 
Washington 6, D.C. 






















JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 


VOLUME 48: 1953 
NUMBERS 261-264 


Published Quarterly by the 
AMERICAN STATISTICAL ASSOCIATION 
WASHINGTON, D. C. 

1953 







JOURNAL OF THE AMERICAN
STATISTICAL ASSOCIATION 


The Editors welcome the submission of manuscripts for possible publication. 
They should be typewritten entirely double-spaced, including footnotes, and 
two copies should be sent to the Editor, W. Allen Wallis, 207 Haskell Hall,
University of Chicago, Chicago 37. Books for review should be sent to the same 
address. Unsolicited book reviews are not accepted, but suggestions of titles for 
review are welcome. 


EDITOR 


W. ALLEN WALLIS, University of Chicago
ASSISTANT TO THE EDITOR: MARGARET A. LABADIE


ASSOCIATE EDITORS 


HOWARD L. JONES
Illinois Bell Telephone Co.
PHILIP J. MCCARTHY
Cornell University
GEORGE M. KUZNETS
University of California
I. RICHARD SAVAGE
National Bureau of Standards
WILLIAM G. MADOW
University of Illinois
C. ASHLEY WRIGHT
Standard Oil Company (N.J.)


ADVISORY PANEL OF FORMER EDITORS 


WILLIAM G. COCHRAN (1945-50)
Johns Hopkins University
FRANK A. ROSS (1926-34, 41-45)
Thetford, Vermont
WILLIAM F. OGBURN (1920-1925)
University of Chicago
FREDERICK F. STEPHAN (1935-40)
Princeton University


Errata: Readers and authors are urged to submit to the Editor notices of 
errors found in this or any previous volume. These will be published once 
a year, in the December issue. 





EDITORIAL COLLABORATORS 


A. G. ABRAMSON
SKF Industries
FORMAN S. ACTON
Princeton University
ROBERT W. ADAMS
Standard Oil Company of New Jersey
FRED C. ANDREWS
Stanford University
KENNETH J. ARNOLD
Michigan State College
LEO A. AROIAN
Hughes Aircraft Company
KENNETH J. ARROW
Stanford University
THOMAS ATKINSON
National Bureau of Economic Research
CLIFFORD BACHRACH
Johns Hopkins University
LOTTE L. BAILYN
Harvard University
T. A. BANCROFT
Iowa State College
ROBERT BECHHOFER
Cornell University
JOSEPH BERKSON
Mayo Clinic
MAX A. BERSHAD
Bureau of the Census
ALLAN BIRNBAUM
Columbia University
Z. W. BIRNBAUM
University of Washington
DAVID BLACKWELL
Howard University
DAVID BLANK
Columbia University
DONALD J. BOGUE
Miami University
E. F. BORGATTA
Harvard University
R. A. BRADLEY
Virginia Polytechnic Institute
ELMER V. BRATT
Lehigh University
GEORGE W. BROWN
International Telemeter Corporation
K. A. BROWNLEE
University of Chicago
R. W. BURGESS
Bureau of the Census
IRVING W. BURR
Purdue University
GLENN L. BURROWS
Bureau of Agricultural Economics
JOSEPH M. CAMERON
National Bureau of Standards


DOUGLAS G. CHAPMAN
University of Washington
HERMAN CHERNOFF
Stanford University
CARL CHRIST
Johns Hopkins University
A. C. COHEN, JR.
University of Georgia
WILLIAM S. CONNOR
National Bureau of Standards
WARREN N. CORDELL
A. C. Nielsen Company
JEROME CORNFIELD
National Cancer Institute
DUDLEY J. COWDEN
University of North Carolina
CECIL C. CRAIG
University of Michigan
DANIEL CREAMER
National Bureau of Economic Research
LEE J. CRONBACH
University of Illinois
EDWIN L. CROSBY
Johns Hopkins University
S. LEE CRUMP
University of Rochester
JOHN CURTISS
New York University
ELEANOR S. DANIEL
Mutual Life Insurance Company of New York
HERBERT T. DAVID
University of Chicago
JOSEPH S. DAVIS
Food Research Institute
DANIEL B. DELURY
Ontario Research Foundation
W. EDWARDS DEMING
New York University
PAUL M. DENSEN
University of Pittsburgh
CYRUS DERMAN
Columbia University
WILFRID J. DIXON
University of Oregon
ROBERT DORFMAN
University of California (Berkeley)
HAROLD F. DORN
National Institute of Health
FRANCIS W. DRESCH
Stanford Research Institute





iv AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1953 


CHARLES DUNNETT
Cornell University
PAUL S. DWYER
University of Michigan
CHURCHILL EISENHART
National Bureau of Standards
FREDERICK A. EKEBLAD
Northwestern University
HOPE T. ELDRIDGE
United Nations
BENJAMIN EPSTEIN
Wayne University
WILLIAM FELLER
Princeton University
CLARENCE B. FINE
Bureau of Internal Revenue
A. L. FINKNER
North Carolina State College
LESTER R. FRANKEL
Alfred Politz Research, Inc.
HAROLD A. FREEMAN
Massachusetts Institute of Technology
MARTIN R. GAINSBRUGH
National Industrial Conference Board
SHELDON GLUECK
Harvard University
LEO A. GOODMAN
University of Chicago
B. G. GREENBERG
University of North Carolina
ULF GRENANDER
University of Stockholm
IRVING I. GRINGORTEN
Air Force Cambridge Research Laboratory
HAROLD GULLIKSEN
Educational Testing Service
EMIL J. GUMBEL
New School of Social Research
JOHN GURLAND
Iowa State College
MARGARET J. HAGOOD
Bureau of Agricultural Economics
MAX HALPERIN
National Heart Institute
C. HORACE HAMILTON
North Carolina State College
P. C. HAMMER
University of Wisconsin
FRANK HARARY
University of Michigan
BOYD HARSHBARGER
Virginia Polytechnic Institute
HARRY P. HARTKEMEIER
University of Missouri
MILLARD W. HASTAY
National Bureau of Economic Research
A. F. HENRY
Harvard University
JACK HIRSHLEIFER
The Rand Corporation


WALTER E. HOADLEY, JR.
Armstrong Cork Company
J. L. HODGES, JR.
University of California (Berkeley)
WASSILY HOEFFDING
University of North Carolina
FREDERICK H. HOLLANDER
Institute for Numerical Analysis
ROBERT HOOKE
Princeton University
HARRY M. HUGHES
University of California (Berkeley)
TOR HULTGREN
National Bureau of Economic Research
JULIUS JAHN
State College of Washington
EMIL H. JEBE
Iowa State College
T. A. JEEVES
University of California (Berkeley)
R. J. JESSEN
Iowa State College
BYRON JOHNSON
University of Denver
LEO KATZ
Michigan State College
JACK KIEFER
Cornell University
EDGAR P. KING
Eli Lilly and Company
LAWRENCE KLEIN
University of Michigan
CARL KOSSACK
Purdue University
WILLIAM H. KRUSKAL
University of Chicago
ERNEST KURNOW
New York University
SIMON KUZNETS
University of Pennsylvania
LOTTE LAZARSFELD
Harvard University
PAUL LAZARSFELD
Columbia University
THOMAS LEHRER
Harvard University
HOWARD LEVENE
Columbia University
GERALD J. LIEBERMAN
Stanford University
JULIUS LIEBLEIN
National Bureau of Standards
RICHARD LINK
Princeton University
CLARENCE LONG
Johns Hopkins University
FREDERICK M. LORD
Educational Testing Service
IRVING LORGE
Columbia University







DUNCAN LUCE
Massachusetts Institute of Technology
EUGENE LUKACS
National Bureau of Standards
RUTH P. MACK
National Bureau of Economic Research
GEORGE F. MAIR
Princeton University
BENJAMIN J. MANDEL
Social Security Administration
HENRY B. MANN
Ohio State University
NATHAN MANTEL
National Institute of Health
HARRY MARKOWITZ
The Rand Corporation
ELI S. MARKS
National Opinion Research Center
ANDREW W. MARSHALL
University of Chicago
PAUL MEIER
Princeton University
HORST MENDERSHAUSEN
Federal Reserve Bank of New York
MARGARET MERRELL
Johns Hopkins University
MAX R. MICKEY
The Rand Corporation
FREDERICK C. MILLS
Columbia University
GEOFFREY H. MOORE
National Bureau of Economic Research
J. E. MORTON
Cornell University
LINCOLN MOSES
Stanford University
JOHN NETER
Syracuse University
J. NEYMAN
University of California (Berkeley)
HAROLD NISSELSON
Bureau of the Census
GOTTFRIED NOETHER
Boston University
FRANK W. NOTESTEIN
Princeton University
DAVID NOVICK
The Rand Corporation
EDWIN G. OLDS
Carnegie Institute of Technology
PAUL S. OLMSTEAD
Bell Telephone Laboratories
GUY ORCUTT
Harvard University
ELLIS R. OTT
Rutgers University
JOHN K. PERRIN
American Telephone and Telegraph Company


OTTO POLLAK
University of Pennsylvania
DANIEL O. PRICE
University of North Carolina
FRANK PROSCHAN
Sylvania Electric Products Company
JOSEPH PUTTER
University of California (Berkeley)
MORTON S. RAFF
U. S. Bureau of Public Roads
MARGARET G. REID
University of Chicago
ROBERT REID
Harvard University
PAUL R. RIDER
Wright-Patterson Air Force Base
J. A. RIGNEY
North Carolina State College
HERMAN RUBIN
Stanford University
PAUL A. SAMUELSON
Massachusetts Institute of Technology
LESTER SARTORIUS
University of Illinois
LEONARD J. SAVAGE
University of Chicago
HENRY SCHEFFE
University of California (Berkeley)
ELIZABETH L. SCOTT
University of California (Berkeley)
STEPHEN SELIGMAN
Harvard University
IRVING H. SIEGEL
The Twentieth Century Fund
ROSEDITH SITGREAVES
Columbia University
JOHN H. SMITH
American University
HERBERT SOLOMON
Columbia University
ROBERT M. SOLOW
Massachusetts Institute of Technology
MORTIMER SPIEGELMAN
Metropolitan Life Insurance Company
WILLIAM A. SPURR
Stanford University
HENRY W. STEINHAUS
Equitable Life Insurance Company
CHARLES STEWART
Bureau of Labor Statistics
J. STEVENS STOCK
McCann-Erickson, Inc.
FRED L. STRODTBECK
Yale University
ZENON SZATROWSKI
University of Buffalo
CONRAD TAEUBER
Bureau of the Census
D. TEICHROEW
Institute for Numerical Analysis







DOROTHY S. THOMAS
University of Pennsylvania
DONOVAN J. THOMPSON
Iowa State College
MARY N. TORREY
Bell Telephone Laboratories
C. K. TSAO
Wayne University
JOHN W. TUKEY
Princeton University
WILLIAM S. VICKREY
Columbia University
JOHN E. WALSH
Naval Ordnance Test Station
FREDERICK V. WAUGH
Bureau of Agricultural Economics
ROGER I. WILKINSON
Bell Telephone Laboratories
S. S. WILKS
Princeton University
SEYMOUR L. WOLFBEIN
Bureau of Labor Statistics
CEDRIC WOLFE
Metropolitan Life Insurance Company
RAMSEY WOOD
Federal Reserve Board
MAX WOODBURY
University of Pennsylvania
JANE WORCESTER
Harvard School of Public Health
W. S. WOYTINSKY
Twentieth Century Fund
MARVIN ZELEN
National Bureau of Standards





INDEX TO VOLUME 48, 1953 


ARTICLES 


ANDERSON, R. L., Recent Advances in Finding Best Operating Conditions

BARCLAY, W. D., and KEMPTHORNE, O., The Partition of Error in Randomized
Blocks

BERKSON, JOSEPH, A Statistically Precise and Relatively Simple Method of
Estimating the Bio-Assay with Quantal Response, Based on the Logistic
Function

BERLE, ADOLF A., Wesley Clair Mitchell: The Economic Scientist

BROWN, J. A. C., HOUTHAKKER, H. S., and PRAIS, S. J., Electronic Computation
in Economic Statistics

BROWNLEE, K. A., HODGES, J. L., JR., and ROSENBLATT, MURRAY, The
Up-and-Down Method with Small Samples

BRUNK, MAX E., and FEDERER, WALTER T., Experimental Designs and
Probability Sampling in Marketing Research

BURKHEAD, JESSE, Changes in the Functional Distribution of Income

CARTER, HUGH, Improving National Marriage and Divorce Statistics

CLARK, ROBERT E., Percentage Points of the Incomplete Beta Function

COCHRAN, WILLIAM G., MOSTELLER, FREDERICK, and TUKEY, JOHN W.,
Statistical Problems of the Kinsey Report

CRAIG, C. C., Combination of Neighboring Cells in Contingency Tables

DANIEL, C., Statistics in Chemical Experimentation

DEMING, W. EDWARDS, On the Distinction between Enumerative and Analytic
Surveys

DEMING, W. EDWARDS, On a Probability Mechanism to Attain an Economic
Balance Between Control of the Resultant Error of Response and the Bias
of Nonresponse

DURAND, DAVID, GUMBEL, E. J., and GREENWOOD, J. ARTHUR, The Circular
Normal Distribution: Theory and Tables

DURBIN, J., A Note on Regression when there is Extraneous Information
about One of the Coefficients

DWYER, PAUL S., and WAUGH, FREDERICK V., On Errors in Matrix Inversion

DYKE, G. V., and HEALY, M. J. R., A Hollerith Technique for the Solution
of Normal Equations

ELKIN, JACK M., Estimating the Ratio between the Proportions of Two
Classes when One is a Sub-class of the Other

EPSTEIN, BENJAMIN, and SOBEL, MILTON, Life Testing

FEDERER, WALTER T., and BRUNK, MAX E., Experimental Designs and
Probability Sampling in Marketing Research

FERBER, ROBERT, Measuring the Accuracy and Structure of Businessmen's
Expectations

FISHER, JACOB, Data for Measuring the Effectiveness of Public
Income-Maintenance Programs

FOLEY, DONALD L., Census Tracts and Urban Research

FOOTE, RICHARD J., The Mathematical Basis of the Bean Method of Graphic
Multiple Correlation

GARVY, GEORGE, The Velocity of Time Deposits




GEARY, R. C., Non-linear Functional Relationship between Two Variables
when One Variable is Controlled

GOLUB, ABRAHAM, Designing Single-Sampling Inspection Plans when the
Sample Size is Fixed

GOODMAN, LEO A., Methods of Measuring Useful Life of Equipment under
Operational Conditions

GREENWOOD, ROBERT E., Probabilities of Certain Solitaire Card Games

GREENWOOD, J. ARTHUR, DURAND, DAVID, and GUMBEL, E. J., The Circular
Normal Distribution: Theory and Tables

GUMBEL, E. J., GREENWOOD, J. ARTHUR, and DURAND, DAVID, The Circular
Normal Distribution: Theory and Tables

HEALY, M. J. R., and DYKE, G. V., A Hollerith Technique for the Solution
of Normal Equations

HODGES, J. L., JR., ROSENBLATT, MURRAY, and BROWNLEE, K. A., The
Up-and-Down Method with Small Samples

HOUTHAKKER, H. S., PRAIS, S. J., and BROWN, J. A. C., Electronic Computation
in Economic Statistics

HYRENIUS, HANNES, On the Use of Ranges, Cross-Ranges, and Extremes in
Comparing Small Samples

JONES, HOWARD L., Approximating the Mode from Weighted Sample Values

KATZ, LEO, Confidence Intervals for the Number Showing a Certain Characteristic
in a Population when Sampling is without Replacement

KEMPTHORNE, O., and BARCLAY, W. D., The Partition of Error in Randomized
Blocks

KIMBALL, BRADFORD F., A Multiple Group Least Squares Problem and the
Significance of the Associated Orthogonal Polynomials

KING, E. P., On Some Procedures for the Rejection of Suspected Data

LADERMAN, J., LITTAUER, S. B., and WEISS, LIONEL, The Inventory Problem

LITTAUER, S. B., WEISS, LIONEL, and LADERMAN, J., The Inventory Problem

MANDEL, B. J., Sampling the Federal Old-Age and Survivors Insurance Records

MARKS, ELI S., MAULDIN, W. PARKER, and NISSELSON, HAROLD, The
Post-Enumeration Survey of the 1950 Census: A Case History in Survey
Design

MAULDIN, W. PARKER, NISSELSON, HAROLD, and MARKS, ELI S., The
Post-Enumeration Survey of the 1950 Census: A Case History in Survey
Design

MILLER, HERMAN P., An Appraisal of the 1950 Census Income Data

MOSHMAN, JACK, Critical Values of the Log-normal Distribution

MOSTELLER, FREDERICK, TUKEY, JOHN W., and COCHRAN, WILLIAM G.,
Statistical Problems of the Kinsey Report

NISSELSON, HAROLD, MARKS, ELI S., and MAULDIN, W. PARKER, The
Post-Enumeration Survey of the 1950 Census: A Case History in Survey
Design

PRAIS, S. J., BROWN, J. A. C., and HOUTHAKKER, H. S., Electronic Computation
in Economic Statistics

PROSCHAN, FRANK, Confidence and Tolerance Intervals for the Normal
Distribution




RAJ, DES, Estimation of the Parameters of Type III Populations from Truncated
Samples

RIDER, PAUL R., The Distribution of the Product of Ranges in Samples from
a Rectangular Population

RIDER, PAUL R., Truncated Poisson Distributions

ROSENBLATT, MURRAY, BROWNLEE, K. A., and HODGES, J. L., JR., The
Up-and-Down Method with Small Samples

ROSHWALB, IRVING, Effect of Weighting by Card-duplication on Efficiency of
Survey Results

SAVAGE, I. RICHARD, Bibliography of Nonparametric Statistics and Related
Topics

SIEGEL, IRVING H., Labor Productivity in the Soviet Union

SIMMONS, WALT R., The Elements of an Industrial Classification Policy

SOBEL, MILTON, and EPSTEIN, BENJAMIN, Life Testing

TUKEY, JOHN W., COCHRAN, WILLIAM G., and MOSTELLER, FREDERICK,
Statistical Problems of the Kinsey Report

VINING, RUTLEDGE, Delimitation of Economic Areas: Statistical Conceptions
in the Study of the Spatial Structure of an Economic System

WAUGH, DAN F., and WAUGH, FREDERICK V., On Probabilities in Bridge

WAUGH, FREDERICK V., and DWYER, PAUL S., On Errors in Matrix Inversion

WAUGH, FREDERICK V., and WAUGH, DAN F., On Probabilities in Bridge

WEILER, H., The Use of Runs to Control the Mean in Quality Control

WEISS, LIONEL, LADERMAN, J., and LITTAUER, S. B., The Inventory Problem

WICKENS, ARYNESS JOY, Statistics and the Public Interest


BOOK REVIEWS 


ANDERSON, R. L., and BANCROFT, T. A., Statistical Theory in Research
. . . CHURCHILL EISENHART
ASHBY, W. ROSS, Design for a Brain . . . A. S. HOUSEHOLDER
BANCROFT, T. A., and ANDERSON, R. L., Statistical Theory in Research
. . . CHURCHILL EISENHART
BELCHER, JOHN C., and SHARP, EMMIT F., A Short Scale for Measuring
Farm Level of Living: A Modification of Sewell's Socio-Economic Scale
. . . FRED L. STRODTBECK
BENNETT, MERRILL K., Population, Food, and Economic Progress
. . . COLIN CLARK
BLAIR, MORRIS MYERS, Elementary Statistics, Revised
. . . R. CLAY SPROWLS
BOGUE, DONALD J., State Economic Areas . . . RUTLEDGE VINING
BUREAU OF THE CENSUS AND BOARD OF GOVERNORS OF THE FEDERAL
RESERVE SYSTEM, Census of Manufactures: 1947: Indexes of Production
. . . PAUL B. SIMPSON
CASEY, ROBERT S., and PERRY, JAMES W., editors, Punched Cards: Their
Applications to Science and Industry . . . HARRY P. HARTKEMEIER
CLARK, CHARLES E., An Introduction to Statistics . . . Z. S. MALINOWSKI
COOMBS, CLYDE H., A Theory of Psychological Scaling . . . BERT F. GREEN
FIRESTONE, O. J., Private and Public Investment in Canada
. . . MILLARD HASTAY





938 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1953


FREUND, JOHN E., Modern Elementary Statistics . . . NORMAN RUDY  367
GARRETT, H. E., Statistics in Psychology and Education
. . . FREDERIC M. LORD  912
GERSCHENKRON, ALEXANDER, A Dollar Index of Soviet Machinery Output,
1927-28 to 1937 . . . JOHN CRAWFORD  157
GORDON, ROBERT A., Business Fluctuations . . . HENRY H. VILLARD  159
GORE, W. L., Statistical Methods for Chemical Experimentation
. . . C. DANIEL  476
GRANT, EUGENE L., Statistical Quality Control, Second Edition
. . . C. C. CRAIG  363
HAGOOD, MARGARET JARMAN, and PRICE, DANIEL O., Statistics for
Sociologists, Revised Edition . . . A. W. MARSHALL  362
HALCROW, HAROLD G., Agricultural Policy of the United States
. . . IVAN M. LEE  663
HALD, A., Statistical Theory with Engineering Applications
. . . G. J. LIEBERMAN
HOUSE OF REPRESENTATIVES, SPECIAL SUBCOMMITTEE OF THE COMMITTEE
ON EDUCATION AND LABOR, Consumers' Price Index
. . . JULES BACKMAN
INSTITUTE FOR ECONOMIC RESEARCH, TOYO SPINNING COMPANY, Causes
of Decline in the World's Cotton Textile Trade
. . . KARL A. FOX
JAMBUNATHAN, M. V., The Theory of Linear Estimation
. . . WILLIAM G. MADOW
KATONA, GEORGE, Psychological Analysis of Economic Behavior
. . . MARGARET G. REID
KISSELGOFF, AVRAM, Factors Affecting the Demand for Consumer
Installment Credit . . . GEORGE E. O'ROURKE
LAMBE, C. G., Elements of Statistics . . . CARL F. KOSSACK
McENTIRE, DAVIS, The Labor Force in California: A Study of
Characteristics in Labor Force, Employment and Occupations in California,
1900-1950 . . . GLADYS L. PALMER
McKINSEY, J. C. C., Introduction to the Theory of Games
. . . IRWIN BROSS
McMILLEN, WAYNE, Statistical Methods for Social Workers
. . . DANIEL O. PRICE
MONAHAN, THOMAS P., The Pattern of Age at Marriage in the United States,
Volumes I and II . . . PAUL H. JACOBSON
MORONEY, M. J., Facts from Figures . . . M. A. GIRSCHICK
MUELLER, ROBERT KIRK, Effective Management through Probability
Controls: How to Calculate Managerial Risks . . . J. C. BAIN
NATIONAL BUREAU OF STANDARDS, A Guide to Tables of the Normal
Probability Integral . . . PAUL S. OLMSTEAD
NYBLÉN, GÖRAN, The Problem of Summation in Economic Science
. . . JOHN CHIPMAN
ORGANISATION OF EUROPEAN ECONOMIC COOPERATION, Measurement of
Productivity . . . PETER O. STEINER







PERRY, JAMES W., and CASEY, ROBERT S., editors, Punched Cards: Their
Applications to Science and Industry . . . HARRY P. HARTKEMEIER
PRICE, DANIEL O., and HAGOOD, MARGARET JARMAN, Statistics for
Sociologists, Revised Edition . . . A. W. MARSHALL
RAIMON, ROBERT L., and TOLLES, N. ARNOLD, Sources of Wage Information
. . . M. I. GERSHENSON
RAO, C. R., Advanced Statistical Methods in Biometric Research
. . . ROSEDITH SITGREAVES
Revue de Statistique Appliquée
RIOS, SIXTO, Introducción a los métodos de la estadística. 1ª Parte
. . . PAUL R. HALMOS
ROBINSON, MARILYN DEVER, Washington State Statistical Abstract
. . . PAUL SIMPSON
SHARP, EMMIT F., and BELCHER, JOHN C., A Short Scale for Measuring
Farm Level of Living: A Modification of Sewell's Socio-Economic Scale
. . . FRED L. STRODTBECK
SIEGEL, IRVING H., Concepts and Measurement of Production and
Productivity . . . ARTHUR L. BROIDA
SWEDEN, STATISTISKA CENTRALBYRÅN, Statistisk Tidskrift
TINTNER, GERHARD, Econometrics . . . DANIEL B. SUITS
TOLLES, N. ARNOLD, and RAIMON, ROBERT L., Sources of Wage Information
. . . M. I. GERSHENSON
WALKER, HELEN M., Mathematics Essential for Elementary Statistics,
Revised Edition . . . LEO A. GOODMAN
WILSON, E. BRIGHT, An Introduction to Scientific Research
. . . E. L. LEHMANN





LIST OF REVIEWERS


Backman, Jules . . . 156
Bain, J. C. . . . 659
Broida, Arthur L. . . . 924
Bross, Irwin . . . 655
Chipman, John . . . 917
Clark, Colin . . . 374
Craig, C. C. . . . 363
Crawford, John . . . 157
Daniel, C. . . . 476
Eisenhart, Churchill . . . 359
Fox, Karl A. . . . 665
Gershenson, M. I. . . . 915
Girschick, M. A. . . . 645
Goodman, Leo . . . 153
Green, Bert F. . . . 657
Halmos, Paul R. . . . 154
Hartkemeier, Harry P. . . . 372
Hastay, Millard . . . 162
Householder, A. S. . . . 669
Jacobson, Paul H. . . . 668
Kossack, Carl F. . . . 153
Lee, Ivan M.
Lehmann, E. L.
Lieberman, G. J.
Lord, Frederic M.
Madow, William G.
Malinowski, Z. S.
Marshall, A. W.
Olmstead, Paul S.
O'Rourke, George E.
Palmer, Gladys L.
Price, Daniel O.
Reid, Margaret G.
Rudy, Norman
Simpson, Paul B.
Sitgreaves, Rosedith
Sprowls, R. Clay
Steiner, Peter O.
Strodtbeck, Fred L.
Suits, Daniel B.
Villard, Henry H.
Vining, Rutledge










MODERN ELEMENTARY STATISTICS 
By John E. Freund, Alfred University 


Designed for students in the social and natural sciences who have very 
little background in mathematics. It emphasizes the meaning of statistics 
rather than the acquisition of mathematical skills. Theoretical distribu- 
tions are introduced as early as Chapter 3 on a more or less intuitive 
level. Chapter 7 has a discussion of, and places repeated emphasis on, the
meaning of probability statements. For the first time, the modern theory 
of the testing of hypotheses is presented on the non-technical level. 

418 pages ~ 6" % 84%” Published 1952 


APPLIED GENERAL STATISTICS 


By Frederick E. Croxton, Columbia University; and 
Dudley J. Cowden, University of North Carolina 


Here is a remarkably comprehensive treatment of statistical methods and 
their application to many fields. Illustrative material has been drawn mainly 
from the fields of economics, sociology and business, and consists of actual
examples, rather than artificial cases contrived for the purpose. Data used 
for the more than 250 charts are the result of actual investigation. 

944 pages 544" x 844" _I Illustrated 


PRACTICAL BUSINESS STATISTICS, 2nd Edition 


By Frederick E. Croxton, Columbia University, and 
Dudley J. Cowden, University of North Carolina 


This text first explains foundation principles in window-clear style, and 
demonstrates their use in real life situations taken from actual firms. In 
effect, the student works out the kind of problems a business statistician 
would have to handle. The material is ideally suited for the individual who
does not intend to become a statistician but does need a grasp of statistics
for his career in modern business. High school algebra is the only
prerequisite.
550 pages    6" x 9"    Published 1948


PRENTICE-HALL, INC., 70 FIFTH AVENUE, NEW YORK 11


GEORGE BANTA PUBLISHING COMPANY, MENASHA, WISCONSIN 










